- GenAI Solution Delivery : Design, build, and release enterprise-grade GenAI systems, including RAG pipelines, Agentic AI systems, multimodal LLMs, and foundation models.

- Agentic AI : At least 6-12 months of experience developing, deploying, and managing Agentic AI systems, with a clear understanding of tool usage, structured outputs, speculative decoding, AST-Code RAG, streaming, and sync/async processing.

- Fine-Tuning and PEFT : Hands-on expertise in PEFT methods (LoRA / QLoRA) for model fine-tuning and optimization.

- Embedding Models : Strong knowledge of embedding models, chunking strategies, and their limitations when used in RAG pipelines.

- Hands-on Coding : Write, test, and maintain clean, efficient, and scalable code in Python for building NLP and AI systems.

- Cloud and Deployment : Deep familiarity with Azure and proven experience deploying LLMs for large-scale inference using LLMOps techniques and orchestration frameworks.

- Tech Stack Proficiency : Strong expertise in PyTorch, TensorFlow, Kubernetes, Docker, LlamaIndex, LangChain, and LangGraph.

- Innovation and Research : Stay current with the latest advancements in AI agents, LLM architectures, and orchestration tools, and experiment with emerging techniques to improve system performance.

- Communication : Strong interpersonal and communication skills, with the ability to design solutions and explain complex AI concepts to both technical and business stakeholders.

Requirements :

- 6-12 years of overall experience with a strong balance of business and technical acumen.

- 3+ years of GenAI development and deployment experience.

- At least 6-12 months of Agentic AI development experience.

- Strong Python development skills.

- Proven experience in RAG pipelines, embeddings, and chunking strategies.

- Expertise in LoRA / QLoRA fine-tuning.

- Hands-on coding for NLP and LLMs.

- Proficiency in PyTorch, TensorFlow, LangChain, LangGraph, and LlamaIndex.

- Deep familiarity with Azure Cloud, LLMOps, orchestration, and large-scale inference.

- Knowledge of speculative decoding, AST-Code RAG, structured outputs, streaming, and async processing.