We are looking for a passionate AI/ML Engineer with 3+ years of experience who can design, train, fine-tune, and deploy machine learning models and build intelligent, data-driven applications using modern GenAI tools such as Hugging Face, OpenAI, and LangChain. The role focuses on Python-based backend development, FastAPI services, LLM integrations, and end-to-end ML workflows.
Responsibilities:
- Design, train, and deploy machine learning and deep learning models for NLP, vision, or recommendation systems.
- Build robust APIs for serving ML/AI models using Python FastAPI.
- Fine-tune and serve Hugging Face Transformers and LLMs (BERT, GPT, Llama, etc.).
- Build data ingestion, preprocessing, and transformation pipelines using Python and Pandas.
- Integrate LLMs, embeddings, and AI agents using LangChain, LlamaIndex, or the OpenAI API.
- Work on chatbots, RAG (Retrieval-Augmented Generation) systems, and AI assistants.
- Optimize model performance, including quantization and inference improvements.
- Collaborate with cross-functional teams (data, product, frontend) to build and ship AI-driven features.
Requirements:
- Strong experience with Python, TensorFlow, PyTorch, and scikit-learn.
- Expertise in FastAPI and building inference APIs.
- Strong understanding of LLM architectures, embeddings, and vector stores such as Pinecone, FAISS, or Milvus.
- Hands-on experience with Hugging Face Transformers, LangChain, LlamaIndex, and the OpenAI API.
- Proficiency in Pandas and NumPy, and experience working with SQL/NoSQL databases.
- Experience building chatbots, RAG pipelines, and conversational AI systems.
- Familiarity with FastAPI async endpoints for real-time inference.
- Experience optimizing models for faster inference.
- B.Tech / M.Tech / MCA in Computer Science, Artificial Intelligence, or a related field.