Description :

We are seeking a seasoned Senior QA Engineer to specialize in the quality assurance of our Generative AI and Agentic applications. This pivotal role requires a deep technical background in backend systems, API integration, and data validation, combined with hands-on experience in testing LLMs and autonomous agents. You will be instrumental in developing evaluation strategies, red teaming our models, and ensuring the reliability and safety of complex, non-deterministic systems.

Key Responsibilities :

- Agent Tool Use Validation : Verify the logic and data flow when AI Agents make function calls to external APIs. Ensure the agent formulates requests correctly and handles responses securely, and validate the integrity of complex, multi-step transactions.

- Adversarial & Exploratory Testing (Red Teaming) : Design and execute test scenarios that try to break the AI's logic by challenging safety guardrails, surfacing hallucinations, and attempting prompt injection or jailbreaking attacks.

- RAG Pipeline Verification : Develop test plans for Retrieval-Augmented Generation (RAG) systems, focusing on data consistency, vector index quality, and verifying that the AI answers accurately using only the provided knowledge base.

- Automated Evaluation Harnesses : Develop and maintain automated test scripts (Evals) using Python to run large prompt sets and measure key AI metrics (accuracy, relevance, coherence, toxicity) against golden datasets.

- Backend System Testing : Apply expertise in API and data testing to the underlying infrastructure supporting the AI models, ensuring high performance, scalability, and security of the entire system.

- System Trace Analysis : Perform deep-dive analysis of agent execution logs ("Chain of Thought") to diagnose reasoning failures and pinpoint errors within the LLM, system prompt, or external integration points.

Required Qualifications :

1. Backend QA / SDET Experience (2 to 3 Years)

- API & Integration Expertise : 2 to 3 years of focused experience in API testing using tools like Postman. Must have a strong understanding of REST architecture, request/response payloads, and integration testing of complex microservices.

- Scripting Proficiency : At least 2 years of experience writing backend automation scripts in Python for logic verification and data processing.

- Data Validation : Strong experience with data testing, including advanced proficiency with SQL for querying databases, validating ETL processes, and ensuring data consistency.

- Testing Fundamentals : Expertise in building robust test cases, managing defect lifecycles, and working effectively within Agile methodologies.

2. Generative AI / Agentic AI Experience (1.5 to 2 Years)

- Hands-on AI Testing : 1.5 to 2 years of experience specifically testing applications built on Large Language Models (LLMs), Generative AI, or Autonomous Agents.

- AI Evaluation Metrics : Proven experience defining and implementing AI evaluation metrics such as factual consistency, contextual precision, and toxicity scores.

- Framework Familiarity : Practical experience with AI orchestration frameworks or libraries (e.g., LangChain, AutoGen, or Semantic Kernel) is highly preferred.

- Adversarial Mindset : Demonstrated experience in Red Teaming or designing security tests specifically for LLM-based applications.

Nice to Have :

- Experience working with Vector Databases (e.g., Pinecone, Weaviate, Milvus).

- Familiarity with MLOps tools or specialized evaluation platforms (e.g., LangSmith, DeepEval).

- Certification or project work related to AI safety and ethical guidelines.

- Experience load testing LLM endpoints to measure latency and throughput.