QA and Performance Testing Engineering Lead
QA & Performance Engineering Lead (AI/LLM Focus)
The Role
We are seeking a high-caliber QA and Performance Engineering Lead to spearhead the testing strategy for enterprise-grade AI and LLM solutions. In this role, you will define the architecture for functional, non-functional, and performance testing, ensuring that complex AI agent workflows and large-scale applications meet the highest standards of reliability and compliance. You will act as a bridge between traditional QA excellence and the cutting-edge requirements of GenAI evaluation.
Core Responsibilities & Technical Expertise
Strategic QA Leadership: Leverage 10+ years of experience leading enterprise-wide testing initiatives within Fortune 500 environments to design comprehensive QA architectures.
AI/LLM Specialized Evaluation: Implement advanced metrics for model assessment, including BLEU, ROUGE, perplexity, and specialized scoring for hallucination and grounding rates.
Performance & Resilience Engineering: Build frameworks for load, stress, and chaos testing to ensure system stability under extreme conditions and peak workloads.
Automation & Orchestration: Engineer robust CI/CD test pipelines using Azure DevOps or GitHub Actions, focusing on automated API testing (pytest, Postman) and integrated test harnesses.
Agentic Workflow Validation: Design testing strategies for multi-step AI agents, covering tool chaining, orchestration, and context injection accuracy.
Data Governance & Compliance: Apply deep knowledge of data lineage tooling (Microsoft Purview, Databricks Unity Catalog) and maintain the strict traceability and auditability standards required in regulated industries.
Lifecycle Management: Oversee model release gates and model registry promotions, and manage the creation and versioning of synthetic datasets.
Key Deliverables
Unified Testing Framework: A standardized taxonomy and coverage model spanning unit, integration, E2E, and AI agent workflows.
AI Evaluation Suite: A comprehensive suite for validating model consistency, toxicity, and correctness, supported by Proof-of-Concept (PoC) validations.
Automated Performance Harness: Scalable workload models designed for peak-load scenarios and resiliency benchmarking.
Smart Quality Gates: Automated pass/fail scoring mechanisms embedded directly into release pipelines across all quality dimensions.
Advanced Observability: Implementation of "Golden Dashboards" tracking real-time metrics such as latency-per-thought, grounding quality, and functional pass rates.
Professional Profile
Expertise in Enterprise QA Architecture (Functional + Non-functional + Performance).
Deep understanding of ML/LLM lifecycle and model promotion pipelines.
Strong background in regulated industries, including compliance and audit readiness.
Hands-on experience with synthetic data generation and dataset versioning.