We are looking for a GenAI Tester to validate, test, and assure the quality of Generative AI solutions including LLM-based applications, chatbots, copilots, and AI-driven decision systems.

The role involves testing model outputs, prompts, data pipelines, integrations, and governance controls to ensure accuracy, reliability, fairness, and compliance across enterprise use cases.

This position sits at the intersection of AI engineering, QA, risk, and business validation, requiring both technical depth and strong analytical thinking.

Key Responsibilities :

GenAI & Model Testing :

- Design and execute test strategies for Generative AI applications using LLMs (OpenAI, Anthropic, Google, open-source models).

- Validate prompt engineering, response accuracy, hallucination risks, context handling, and reasoning consistency.

- Perform functional, non-functional, and regression testing of AI-driven workflows.

- Evaluate model behavior across edge cases, ambiguous queries, and adversarial inputs.

Data & Output Validation :

- Test training, fine-tuning, and inference datasets for quality, bias, leakage, and completeness.

- Validate AI outputs for factual correctness, relevance, tone, safety, and compliance.

- Ensure alignment with business rules, domain knowledge, and regulatory expectations.

AI Risk, Bias & Ethics Testing :

- Identify and document bias, toxicity, unsafe content, and fairness issues.

- Test AI systems against responsible AI principles, governance frameworks, and audit requirements.

- Validate explainability, traceability, and reproducibility of AI outputs where applicable.

Integration & System Testing :

- Test integrations between GenAI models and enterprise systems (APIs, databases, workflows).

- Validate end-to-end AI pipelines including data ingestion, inference layers, and UI interactions.

- Work closely with AI engineers, data scientists, product owners, and business stakeholders.

Automation & Tooling :

- Develop automated test scripts for AI validation using Python or testing frameworks.

- Leverage AI testing tools, evaluation metrics, and benchmarking frameworks.

- Create reusable test datasets, prompt libraries, and evaluation scorecards.

Reporting & Governance :

- Create detailed defect reports, test summaries, risk assessments, and validation dashboards.

- Track KPIs such as accuracy, response latency, hallucination rate, and SLA adherence.

- Support audits, model reviews, and production readiness assessments.

Must-Have Skills & Qualifications :

- Strong experience in Software Testing / QA with exposure to AI or GenAI systems.

- Understanding of LLMs, prompt engineering, embeddings, RAG, and inference workflows.

- Hands-on experience with API testing, data validation, and integration testing.

- Strong analytical skills to assess AI output quality beyond traditional pass/fail testing.

- Excellent documentation and communication skills.

Technical Skills :

Programming/Scripting :

- Python (preferred), SQL

Testing Tools :

- Postman, PyTest, JUnit, Selenium (as applicable)

AI/ML Concepts :

- LLMs, NLP, model evaluation metrics, prompt testing

Platforms (any) :

- Azure OpenAI, AWS Bedrock, GCP Vertex AI, OpenAI APIs

DevOps/Tools :

- Git, CI/CD pipelines, Jira, Confluence

Did you find something suspicious?

Similar jobs that you might be interested in

Posted by

Kiran

HR Manager at EverestDX

Last Active: 3 Feb 2026

Job Views:
134

Applications: 44

Recruiter Actions: 1

Posted in

Quality Assurance

Functional Area

QA & Testing

Job Code

1608285

Jobs by location

Interview Questions for you

View All

How to Write Leave Application for Urgent Work: Format & Samples (2025)

Top 90+ Machine Learning Interview Questions and Answers

Top 40+ Deep Learning Interview Questions and Answers