hirist

Data Scientist - LLM

Zorba Consulting
Multiple Locations
4 - 6 Years

Posted on: 22/09/2025

Job Description



Primary Title : Senior LLM Engineer (4+ years) - Hybrid, India.

About The Opportunity :

A technology consulting firm operating at the intersection of Enterprise AI, Generative AI and Cloud Engineering seeks an experienced LLM-focused engineer.

You will build and productionize LLM-powered products and integrations for enterprise customers across knowledge management, search, automation, and conversational AI use-cases.

This is a hybrid role based in India for candidates with strong hands-on LLM engineering experience.

Role & Responsibilities :

- Own the design and implementation of end-to-end LLM solutions : data ingestion, retrieval (RAG), fine-tuning, inference, and monitoring for production workloads.

- Develop robust Python microservices to serve LLM inference, retrieval, and agentic workflows using LangChain/LangGraph or equivalent toolkits.

- Implement and optimise vector search pipelines (FAISS/Pinecone/Milvus), embedding generation, chunking strategies, and relevance tuning for sub-second retrieval.

- Perform parameter-efficient fine-tuning (LoRA/adapters) and evaluation workflows; manage model versioning and automated validation for quality and safety.

- Containerise and deploy models and services with Docker and Kubernetes; integrate with cloud infra (AWS/Azure/GCP) and CI/CD for repeatable delivery.

- Establish observability, alerting, and performance SLAs for LLM services; collaborate with cross-functional teams to define success metrics and iterate rapidly.
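The retrieval responsibilities above (chunking, embedding generation, vector search) can be sketched in miniature. This is a toy illustration only: the hashed bag-of-words `embed()` stands in for a real embedding model, and the brute-force cosine index plays the role FAISS/Pinecone/Milvus would fill at scale; all names here are hypothetical.

```python
import numpy as np

def chunk(text, size=200, overlap=50):
    """Fixed-size chunking with overlap; production pipelines often split on sentences or headings."""
    step = size - overlap
    return [text[start:start + size] for start in range(0, max(len(text) - overlap, 1), step)]

def embed(texts, dim=64):
    """Hypothetical embedder: hashed bag-of-words vectors, standing in for a real embedding model."""
    out = np.zeros((len(texts), dim))
    for i, t in enumerate(texts):
        for tok in t.lower().split():
            out[i, hash(tok) % dim] += 1.0
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-9)  # unit-normalise so dot product = cosine similarity

class VectorIndex:
    """Brute-force cosine search; a FAISS IndexFlatIP serves this role for real workloads."""
    def __init__(self, chunks):
        self.chunks = chunks
        self.vectors = embed(chunks)

    def search(self, query, k=2):
        q = embed([query])[0]
        scores = self.vectors @ q
        top = np.argsort(-scores)[:k]
        return [(self.chunks[i], float(scores[i])) for i in top]
```

In a full RAG pipeline the top-k chunks returned by `search()` would be concatenated into the LLM prompt as grounding context before inference.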

Skills & Qualifications :

Must-Have :

- 4+ years of engineering experience, including 2+ years working directly on LLM/Generative AI projects.

- Strong Python skills and hands-on experience with PyTorch and the Hugging Face Transformers library.

- Practical experience building RAG pipelines, vector search (FAISS/Pinecone/Milvus), and embedding workflows.

- Experience with fine-tuning strategies (LoRA/adapters) and evaluation frameworks for model quality and safety.

- Familiarity with Docker, Kubernetes, cloud deployment (AWS/Azure/GCP), and Git-based CI/CD workflows.

- Solid understanding of prompt engineering, retrieval strategies, and production monitoring of ML services.
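The parameter-efficient fine-tuning requirement rests on one piece of arithmetic: LoRA freezes the pretrained weight W and trains only a low-rank update (alpha/r) * B A. A minimal numpy sketch with illustrative toy dimensions (real models use layers thousands of units wide, typically via the `peft` library rather than by hand):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 16, 4, 8   # toy sizes; r is the LoRA rank

W = rng.normal(size=(d_out, d_in))           # pretrained weight, frozen during fine-tuning
A = rng.normal(scale=0.01, size=(r, d_in))   # trainable low-rank factor
B = np.zeros((d_out, r))                     # trainable; zero-initialised so the update starts at 0

def lora_forward(x):
    # y = W x + (alpha / r) * B (A x): frozen base path plus a learned low-rank delta
    return W @ x + (alpha / r) * (B @ (A @ x))

# Only A and B are trained: r * (d_in + d_out) parameters instead of d_in * d_out.
lora_params = A.size + B.size
full_params = W.size
```

Because B starts at zero, the adapted layer initially reproduces the base model exactly, and here the trainable parameter count (128) is half the full weight's (256); at realistic dimensions the ratio is far smaller.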

Preferred :

- Experience with LangChain/LangGraph, agent frameworks, or building tool-calling pipelines.

- Exposure to MLOps platforms, model registries, autoscaling for low-latency inference, and cost-optimisation techniques.

- Background in productionising LLMs for enterprise use-cases (knowledge bases, search, virtual assistants).
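The tool-calling pipelines mentioned above follow a common loop: the model either requests a tool or emits a final answer, and the orchestrator executes tools and feeds results back. A minimal sketch in which `fake_llm` and `search_tool` are hypothetical stubs (a real system would call a chat-completion API and actual tools, e.g. via LangChain/LangGraph):

```python
def search_tool(query):
    """Hypothetical tool; a real deployment might query a vector store or external API."""
    return f"results for {query!r}"

TOOLS = {"search": search_tool}

def fake_llm(messages):
    """Stub standing in for a chat-completion call: first requests a tool, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "search", "arguments": {"query": "LLM monitoring"}}
    return {"final": "Here is a summary based on the search results."}

def run_agent(user_input, max_steps=3):
    """Orchestration loop: execute requested tools, append results, stop on a final answer."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        decision = fake_llm(messages)
        if "final" in decision:
            return decision["final"]
        result = TOOLS[decision["tool"]](**decision["arguments"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent exceeded max_steps")
```

The `max_steps` cap is the important production detail: it bounds cost and latency when a model keeps requesting tools instead of answering.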

Benefits & Culture Highlights :

- Hybrid work model with flexible in-office collaboration and remote days; competitive market compensation.

- Opportunity to work on high-impact enterprise AI initiatives and shape production-grade GenAI patterns across customers.

- Learning-first culture : access to technical mentorship, experimentation environments, and conferences/learning stipend.

To apply : include a brief portfolio of LLM projects, links to relevant repositories or demos, and a summary of production responsibilities.

This role is ideal for engineers passionate about turning cutting-edge LLM research into reliable, scalable enterprise solutions.

