
Job Description




- Location: Bangalore (Work from Office)
- Experience: 1-4 years

Key Responsibilities:

- Test and validate AI agents and LLM workflows to ensure correctness and stability.
- Design and execute test cases to evaluate agentic behavior, prompt responses, and task accuracy.
- Use DeepEval to create and automate evaluation frameworks for LLM and agent testing.
- Identify, document, and track bugs, performance issues, and logic errors.
- Collaborate with developers, prompt engineers, and data scientists to improve system quality.
- Develop and maintain test documentation, including plans, cases, and reports.
- Ensure AI agents handle contexts, prompts, and tasks as intended in real-world scenarios.
- Contribute to building automated QA pipelines for continuous testing and monitoring of agents.

Required Skills & Qualifications:

- 1-4 years of experience in QA, software testing, or AI testing.
- Strong understanding of Generative AI, LLM agents, and autonomous workflows.
- Experience with manual testing and familiarity with automation testing frameworks.
- Hands-on experience with, or working knowledge of, DeepEval for evaluating LLM outputs.
- Good analytical, debugging, and documentation skills.
- Familiarity with Java or Python for writing scripts or automation utilities.
- Excellent collaboration and communication abilities.

Good to Have:

- Experience with LangChain, CrewAI, AutoGen, or similar agentic frameworks.
- Understanding of AI evaluation metrics (accuracy, coherence, relevance, hallucination detection).
- Exposure to CI/CD, test automation, or model monitoring tools.

