Senior/Principal AI Systems Engineer
LLM Agents, RAG, Decision Intelligence, Knowledge Systems
For our client, a US-oriented technology startup building a new generation of AI products, we are looking for a Senior / Principal AI Systems Engineer. This is not a simple chatbot project, and it is not a role focused only on prompt engineering. We are looking for someone who can design and build production-grade AI systems that solve complex decision-making problems using LLMs, agentic workflows, RAG, memory systems, explainable scoring, feedback loops and business intelligence layers.
About the project
The project involves building an advanced AI platform that supports the analysis of complex data, assessment of fit, recommendation of the best possible decisions and continuous learning from user feedback.
The system is intended to operate as an intelligence layer on top of data, workflows and business decisions. Key areas include data structuring, reasoning, scoring, evidence, memory, human-in-the-loop workflows and continuous improvement of recommendations based on real decisions and outcomes.
This is a project for someone who wants to build something more ambitious than a classic SaaS application or another wrapper around a model API.
Responsibilities
You will co-create the architecture and first versions of the AI product, especially in the following areas:
designing LLM-based agentic workflows,
building an AI orchestration system for multiple specialised modules,
designing explainable scoring and recommendation mechanisms,
creating data structures for analysis, assessment and matching,
building a memory system / decision memory layer,
designing feedback loops that improve future system recommendations,
working with RAG, vector search, embeddings and knowledge representation,
integrating LLMs with backend services, databases and external tools,
designing AI outputs in a structured, testable and auditable way,
implementing AI observability, evaluations and prompt/model versioning mechanisms,
contributing to MVP scope definition, architecture, roadmap and technical risk assessment.
What we are looking for
We are looking for someone who can independently translate a complex business problem into a working technical architecture.
Ideally, you have experience in several of the following areas:
LLM applications,
AI agents / agentic workflows,
RAG / GraphRAG / vector search,
embeddings and vector databases,
knowledge graphs or knowledge representation,
explainable AI / explainable scoring,
decision intelligence / recommendation systems,
feedback loops,
memory systems,
structured outputs,
production-grade prompt engineering,
AI evaluations,
LLM observability,
backend architecture,
SaaS architecture,
data pipelines,
integrations,
security and auditability.
Technologies that may be useful
We do not require experience with this exact stack, but we are particularly interested in people who have worked with:
Python / FastAPI,
TypeScript / Next.js / React,
PostgreSQL,
pgvector / Qdrant / Pinecone / Weaviate,
LangGraph / LangChain / LlamaIndex / OpenAI Agents SDK / PydanticAI,
OpenAI / Anthropic / Gemini / Mistral / open-source LLMs,
Docker,
Kubernetes,
AWS / GCP / Azure / Vercel,
Sentry / Langfuse / OpenTelemetry / Braintrust / DeepEval,
REST / GraphQL / event-driven architecture.
The final stack is open for discussion. What matters more than any specific tool is your ability to build stable, testable and scalable AI systems.
Who this role is not for
This role is not a good fit for someone who:
has only built simple chatbots,
knows prompt engineering but not backend or architecture,
does not understand data-heavy systems,
does not want to take ownership of the product,
expects a fully defined step-by-step specification,
is uncomfortable working with ambiguity,
wants only to execute tasks without influencing architecture.
What would be a strong advantage
A strong advantage would be experience building systems that:
analyse data from multiple sources,
create recommendations or rankings,
justify their outputs with evidence,
learn from feedback,
support human decision-making,
have an auditable decision process,
run in production, not only as demos.
We are especially interested in people who understand the difference between:
an AI demo and a production AI system,
a prompt and an agent,
an LLM judge and an explainable scoring engine,
a simple RAG pipeline and a system with memory and feedback,
a chatbot and a decision intelligence platform.