Posted on: 21/01/2026
Description:
What you'll do: Own AI feature delivery from prototype to production.
- Build RAG pipelines (chunking, embeddings, vector stores), prompt/program orchestration, and guardrails.
- Fine-tune and/or distill models (open/closed source) for classification, generation, and tool-use.
- Implement robust offline & online evals (unit evals, golden sets, regression tests, user-feedback loops).
- Ship reliable services: APIs, workers, model servers, and monitoring/observability (latency, cost, quality).
- Partner with product/design to shape problem statements, success metrics, and experiment plans.
- Champion engineering best practices (reviews, testing, docs, incident learnings).
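To give a flavor of the RAG work described above, here is a minimal retrieval sketch in plain Python. It is illustrative only: the bag-of-words "embedding" stands in for a real embedding model, and the chunk sizes and function names are assumptions, not this team's actual pipeline.

```python
import math
from collections import Counter

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (one simple chunking strategy)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real pipeline would call an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and return the top-k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

In a production system the vector search would be backed by a store such as pgvector, Milvus, or Weaviate rather than a linear scan.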
Requirements:
- Tech you might use here:
- Languages: Python, TypeScript/Node.
- AI/ML: PyTorch, Hugging Face, OpenAI/Anthropic/other LLM APIs, vLLM/TensorRT-LLM, LangChain/LlamaIndex (pragmatically).
- Data & Retrieval: Postgres, Redis, Milvus/pgvector/Weaviate, Kafka.
- Infra: Docker, Kubernetes, CI/CD, Grafana/Prometheus, cloud (AWS/GCP).
- Quality: Prompt/unit tests, offline eval harnesses, canary analysis, A/B testing.
- We're looking for 3 to 7+ years of software engineering experience, with 1–3+ in applied ML/LLM or search/retrieval.
- Strong Python engineering (typing, testing, packaging) and service design (APIs, queues, retries, idempotency).
- Hands-on with at least two of: RAG in prod, fine-tuning (LoRA/QLoRA), embeddings/ANN search (Annoy/HNSW), function/tool calling, or model serving at scale.
- Practical evaluation mindset: create golden datasets, design metrics (accuracy, faithfulness, toxicity, latency, cost).
- Product sense and ownership: you measure impact, not just model scores.
- Clear communication and collaborative habits (PRs, design docs, incident notes).
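The evaluation bullets above can be sketched as a tiny offline harness. A hedged sketch: the class names, the exact-match metric, and the 0.9 threshold are all illustrative assumptions; a real harness would also track faithfulness, latency, and cost.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class GoldenCase:
    """One labeled example in a golden set: input prompt plus expected answer."""
    prompt: str
    expected: str

def exact_match(output: str, expected: str) -> float:
    """Simplest possible metric: case-insensitive exact match, scored 0 or 1."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

def run_eval(model: Callable[[str], str],
             golden: list[GoldenCase],
             metric: Callable[[str, str], float] = exact_match,
             threshold: float = 0.9) -> dict:
    """Score a model over the golden set; gate releases on a minimum mean score."""
    scores = [metric(model(case.prompt), case.expected) for case in golden]
    mean = sum(scores) / len(scores)
    return {"mean_score": mean, "passed": mean >= threshold}
```

Run as a regression test in CI, a harness like this catches quality drops before a prompt or model change ships.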
Nice to have:
- Experience with multi-tenant architectures, RBAC/ABAC, and data governance.
- Safety & reliability work (red-teaming, jailbreak defenses, PII handling).
- Frontend familiarity (React) to iterate quickly on UX for AI features.
- Prior startup experience or 0→1 product building.
What success looks like (first 90 days):
- Ship a scoped AI feature into customer hands with an eval harness and dashboards.
- Reduce either latency or cost of an existing pipeline by ~20–30% without quality loss.
- Add at least one reusable internal component (chunker, ranker, guardrail, eval set).
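A reusable guardrail component of the kind mentioned above can be as small as a text redactor. A hypothetical sketch: the regex patterns and names are illustrative only, not a production-grade PII detector (real systems use dedicated PII-detection services).

```python
import re

# Illustrative PII patterns; a production guardrail would use a dedicated
# detector with far broader coverage than these two regexes.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched PII spans with labeled placeholders before logging
    or forwarding text to a third-party model."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text
```

Packaged once, a component like this can sit in front of every logging call and LLM request in the stack.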
Interview process:
- Intro chat (30 min): role fit & expectations.
- Technical deep-dive (60 min): systems + ML/LLM problem solving.
- Practical exercise (take-home or pairing, 3–4 hrs): build a small RAG/eval pipeline.
- Final loop (60–90 min): product & culture, past work, offer Q&A.