QA Engineer
Andersen is hiring a QA Engineer to ensure quality and stability of a secure cloud platform with collaboration and analytics features, supporting continuous improvements and reliable product delivery.
The customer is an international company delivering professional and technology-enabled solutions that support effective collaboration, structured communication, and operational efficiency for organizations. It operates in a fast-growing environment, focusing on scalability, security, and continuous improvement while developing digital platforms used by diverse clients worldwide.
The project is focused on enhancing a secure, cloud-based board management platform with intuitive meeting tools, real-time collaboration, and advanced analytics. It also includes building and maintaining scalable AI infrastructure, orchestration patterns, and observability to ensure reliable and intelligent platform performance.
Responsibilities:
Designing test cases for non-deterministic AI systems.
Executing evaluation runs against AI workflows using LangFuse.
Validating AI outputs against acceptance criteria and business rules.
Building and maintaining regression test suites for AI features.
Identifying edge cases and failure patterns in AI behavior.
Triaging and documenting accuracy/quality issues.
Monitoring production quality metrics and flagging degradation.
Partnering with the AI Engineer on test data curation and quality thresholds.
Must-have:
Experience as a QA Engineer or in a similar role for 3+ years.
Experience testing AI/ML or other non‑deterministic systems, with an understanding of probabilistic outputs and variability in model behavior.
Solid knowledge of QA methodologies adapted for AI workflows, including evaluation‑based testing, scenario testing, and semantic validation.
Hands‑on experience designing test cases for systems without fixed expected outputs, using acceptance criteria, heuristics, and quality thresholds.
Experience executing evaluation runs with AI observability or evaluation tools (experience with LangFuse is a strong advantage).
Ability to validate LLM outputs against business rules, prompt requirements, safety guidelines, and product acceptance criteria.
Experience building and maintaining regression suites for AI features (prompt regressions, dataset-based regressions, workflow regressions).
Ability to identify edge cases, emergent patterns, and failure modes in AI behavior (e.g., hallucinations, inconsistency, bias, context loss).
Experience documenting issues with detailed repro steps, evaluation evidence, logs, and accuracy/quality metrics.
Ability to work with JSON, API tools (Postman, Swagger), and logs from AI systems.
Level of English – Upper-Intermediate or higher.
Nice to have:
Exposure to evaluation frameworks (LangFuse evals, Ragas, TruLens, or internal evaluators).
Understanding of key LLM quality metrics such as accuracy, precision/recall, semantic similarity scores, or custom evaluation metrics.
Familiarity with production monitoring dashboards or telemetry tools to detect quality degradation.
Basic understanding of LLMs, embeddings, RAG workflows, and prompt-based systems.
Experience with test data curation, labeling, or annotation processes.
Reasons why this job would be interesting to you:
Experience working with leaders in FinTech, Healthcare, Retail, Telecom, and other industries. Andersen cooperates with businesses such as Samsung, Siemens, Johnson & Johnson, BNP Paribas, Ryanair, Mercedes, TUI, Verivox, Allianz, T-Systems, and others.
The opportunity to change projects and/or develop expertise in an interesting business domain.
Job conditions – you can work fully remotely, from the office, or in a hybrid format.
Guarantee of professional, financial, and career growth! The company has introduced systems of mentoring and adaptation for each new employee.
The opportunity to earn up to an additional 1,000 USD per month by participating in the company's activities.
Access to the corporate training portal, which holds the company's entire knowledge base and is constantly updated.
Bright corporate life (parties / pizza days / PlayStation / fruits / coffee / snacks / movies).
Certification compensation (AWS, PMP, etc.).
Referral program.
English courses.
Private health insurance and compensation for sports activities.
Join us!
Your personal data is protected in accordance with GDPR regulations. Learn more: https://andersenlab.com/privacy-policy/pl