Inference Platform Engineer (LLM & Kubernetes)
N-iX is a global software development service company that helps businesses across the globe create next-generation software products. Founded in 2002, we unite 2,400+ tech-savvy professionals across 40+ countries, working on impactful projects for industry leaders and Fortune 500 companies. Our expertise spans cloud, data, AI/ML, embedded software, IoT, and more, driving digital transformation across finance, manufacturing, telecom, healthcare, and other industries. Join N-iX and become part of a team where your ideas make a real impact.
We are looking for an Inference Platform Engineer (LLM & Kubernetes) to join our team.
Our client is a leading European AI company developing large language models and generative platforms for enterprise and government clients.
Their products combine high performance with transparency, accessibility, and data security, in full alignment with European regulatory and ethical standards.
As an Inference Platform Engineer (LLM & Kubernetes), you will take ownership of inference API integration, operations, and platform reliability across production AI systems.
This role is scoped at 1–2 FTE split across several senior specialists, ensuring continuity of inference services and full coverage during planned and unplanned absences as the team takes over end-to-end responsibility for LLM inference.
Responsibilities:
Take ownership of inference API integration, orchestration, and long-term platform reliability
Lead operations for LLM inference services as they transition under internal ownership
Ensure inference API availability, latency, and performance in production environments
Design and maintain multi-turn conversation handling, chat templates, and prompt orchestration (see the sketch after this list)
Proactively monitor, troubleshoot, and resolve inference platform issues, logs, and errors
Manage Kubernetes deployments, Helm charts, and ArgoCD workflows for inference services
Ensure platform security, CVE monitoring, and compliance with internal and regulatory standards
Collaborate closely with backend, platform, and infrastructure teams
Maintain clear operational documentation to support shared ownership across multiple FTEs
Participate in support / on-call rotation as required to ensure operational continuity
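To make the conversation-handling responsibility concrete, here is a minimal sketch of multi-turn orchestration against an OpenAI-compatible chat API. The model name, system prompt, and helper function are illustrative assumptions, not the client's actual stack:

```python
# Minimal multi-turn conversation sketch against an OpenAI-compatible
# chat API. Model name and system prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chat_turn(history: list[dict], user_message: str,
              model: str = "gpt-4o-mini") -> str:
    """Append the user's message, call the API, and keep the reply in history."""
    history.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(model=model, messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

history = [{"role": "system", "content": "You are a concise assistant."}]
print(chat_turn(history, "What is Kubernetes?"))
print(chat_turn(history, "And how does Helm relate to it?"))  # second turn sees the first
```

Keeping the full message history in a single list is the simplest orchestration pattern; production systems typically layer chat templates, truncation, and persistence on top of it.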
Requirements:
5+ years of Python programming experience
Strong Kubernetes (k8s) experience, including deployment, scaling, and monitoring
Experience handling large-scale logs, monitoring, and observability in production
Basic knowledge of LLM fundamentals and the surrounding industry (e.g., what types of models exist, how an LLM generates output)
Client-side experience developing against an inference API (e.g., OpenAI, Anthropic, OpenRouter) and an understanding of how such APIs are structured; experience providing or deploying a similar API yourself is a strong plus (see the sketch after this list)
Ability to independently own and operate inference services in a shared-responsibility model (1–2 FTE split across multiple specialists)
Strong communication skills and experience working with cross-functional engineering teams
Solid Linux fundamentals
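As an illustration of the client-side API experience above, the following sketch streams a completion from an OpenAI-compatible API while recording time-to-first-token and total latency, the kind of signal this role would feed into production observability. The model name and prompt are placeholders:

```python
# Streaming inference sketch with basic latency measurement.
# Model name is a placeholder, not the client's deployment.
import time
from openai import OpenAI

client = OpenAI()

def timed_stream(prompt: str, model: str = "gpt-4o-mini") -> None:
    start = time.perf_counter()
    first_token_at = None
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # Some chunks carry no content (e.g., the initial role delta).
        if chunk.choices and chunk.choices[0].delta.content:
            if first_token_at is None:
                first_token_at = time.perf_counter()
            print(chunk.choices[0].delta.content, end="", flush=True)
    total = time.perf_counter() - start
    ttft = (first_token_at - start) if first_token_at else total
    print(f"\nTTFT: {ttft:.2f}s, total: {total:.2f}s")

timed_stream("Explain ArgoCD in one sentence.")
```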
Nice to have:
Hands-on experience with Helm charts, ArgoCD, and CI/CD for AI services
Interest in working partly with Rust
Senior-level experience with production LLM inference or AI platform operations
Experience building or operating multi-turn conversational AI systems
Familiarity with real-time API orchestration or streaming inference workloads
Background in MLOps, AI platform engineering, or SRE
Experience with cloud-based inference deployments and scaling
Knowledge of security, CVE scanning, and operational best practices
Technology Stack:
Inference: OpenAI, Anthropic, or other LLM inference APIs
Focus Areas: API integration, multi-turn conversation orchestration, tool calling (see the sketch after this list), platform reliability
Infrastructure: Kubernetes, Helm, ArgoCD, cloud or hybrid environments
Monitoring: Logs, metrics, observability tools for inference systems
Workflow: Git, CI/CD pipelines, documentation, operational runbooks, incident handling
Standards: Reliability, latency, performance, security, maintainability
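Since tool calling is named as a focus area, here is a minimal sketch of the standard tool-calling round trip against an OpenAI-compatible API. The tool schema, model name, and stubbed status lookup are hypothetical, for illustration only:

```python
# Tool-calling round trip sketch. Tool schema, model name, and the
# stubbed health check are hypothetical.
import json
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder model name

tools = [{
    "type": "function",
    "function": {
        "name": "get_service_status",  # hypothetical tool for illustration
        "description": "Return the health status of a named inference service.",
        "parameters": {
            "type": "object",
            "properties": {"service": {"type": "string"}},
            "required": ["service"],
        },
    },
}]

messages = [{"role": "user", "content": "Is the chat-inference service healthy?"}]
first = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
msg = first.choices[0].message

if msg.tool_calls:  # the model chose to call our tool
    call = msg.tool_calls[0]
    args = json.loads(call.function.arguments)
    # In production this would query real health checks; stubbed here.
    result = {"service": args.get("service"), "status": "healthy"}
    messages.append(msg)
    messages.append({"role": "tool", "tool_call_id": call.id,
                     "content": json.dumps(result)})
    final = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
    print(final.choices[0].message.content)
else:
    print(msg.content)  # the model answered without calling the tool
```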