We are looking for a skilled QA Engineer with experience in testing Large Language Model (LLM) applications to ensure the reliability, accuracy, and performance of AI-driven solutions. The ideal candidate has hands-on experience with LLM evaluation frameworks, including DeepEval, and with testing AI/ML-based applications, with a strong focus on quality assurance in generative AI environments.

Key Responsibilities:

- Test and validate LLM-based applications for accuracy, reliability, and performance

- Design and execute test cases for generative AI workflows and NLP applications

- Work with LLM evaluation frameworks (e.g., DeepEval) to measure model quality

- Validate prompts, responses, hallucination risks, and output consistency

- Collaborate with AI engineers, data scientists, and developers to improve model quality

- Perform functional, regression, and performance testing for AI-driven applications

- Document testing methodologies, results, and improvement recommendations

- Identify risks, bugs, and optimization opportunities in AI workflows

Mandatory Skills:

- Experience in testing LLM applications or AI-driven platforms

- 1-4 years of experience in a similar role

- Hands-on experience with DeepEval or similar LLM evaluation frameworks