Posted on: 27/01/2026
Description :
The employer is a fast-scaling organization in the Enterprise AI & Machine Learning sector, focused on building production-grade Large Language Model (LLM) solutions, Retrieval-Augmented Generation (RAG) systems, and real-time intelligent search for B2B customers.
The team delivers low-latency inference, scalable vector search, and robust MLOps for mission-critical applications.
Primary role title : Senior Machine Learning Engineer (LLM & RAG).
Location : Pune, India (On-site).
Role & Responsibilities :
- Design, build, and productionize end-to-end LLM & RAG pipelines : data ingestion, embedding generation, vector indexing, retrieval, and inference integration.
- Implement and optimize vector search solutions using FAISS/Pinecone and integrate with prompt orchestration frameworks (e.g., LangChain).
- Optimize model serving for latency and cost : batching, quantization, ONNX/Triton deployment, and autoscaling on Kubernetes.
- Develop robust microservices and REST/gRPC APIs to expose inference and retrieval capabilities to product teams.
- Establish CI/CD, monitoring, and observability for ML models and pipelines (model validation, drift detection, alerting).
- Collaborate with data scientists and platform engineers to iterate on model architectures, embeddings, and prompt strategies; mentor junior engineers.
Skills & Qualifications :
Must-Have :
- PyTorch.
- Hugging Face Transformers.
- LangChain.
- Retrieval-Augmented Generation.
- FAISS.
- Pinecone.
- Docker.
- Kubernetes.
Preferred :
- Triton Inference Server.
- Apache Kafka.
- Model quantization.
Qualifications :
- 6-9 years of hands-on experience in ML/LLM engineering with a strong track record of shipping production ML systems.
- Comfortable working on-site in Pune.
- Strong software engineering fundamentals and experience collaborating across product and data teams.
Benefits & Culture Highlights :
- Opportunity to lead end-to-end LLM projects and shape AI product direction in a growth-stage engineering team.
- Collaborative, fast-paced environment with mentorship, tech ownership, and exposure to modern MLOps tooling.
- Competitive compensation, professional development budget, and on-site engineering culture in Pune.
To apply, bring strong LLM production experience, demonstrable RAG implementations, and a bias for scalable, maintainable systems.
Join an engineering-first team building the next generation of AI-powered enterprise features.