Description :

Job Overview :

We are seeking a highly skilled Senior Machine Learning Engineer with advanced expertise in classical ML, deep learning, generative AI, graph-based learning, and large language model (LLM) systems.

The role requires hands-on experience designing end-to-end machine learning solutions, architecting scalable AI pipelines, training large models, and deploying production-grade systems on modern cloud and MLOps infrastructure.

The ideal candidate is an engineering-driven professional capable of building robust, automated, high-performance AI products.

Key Responsibilities :

ML, DL, GenAI & Graph ML Engineering :

- Build, train, and optimize machine learning and deep learning models for supervised, unsupervised, and reinforcement learning, as well as forecasting, generative AI, and graph-based use cases.

- Develop advanced architectures such as transformers, GNNs, diffusion models, encoder-decoder systems, and self-supervised learning pipelines.

- Implement scalable model training strategies including distributed training, mixed-precision training, hyperparameter optimization, and model parallelism.

End-to-End AI Pipeline Development :

- Design and build production-grade pipelines covering data ingestion, preprocessing, feature engineering, model training, validation, packaging, deployment, and monitoring.

- Implement automated CI/CD workflows for ML using MLOps frameworks and cloud-native tooling.

- Build reusable components for model retraining, A/B testing, and continuous learning.

LLM, RAG & GenAI Systems :

- Develop LLM-based applications using fine-tuning, prompt engineering, parameter-efficient fine-tuning (PEFT) methods such as adapters and LoRA, and vector database integrations.

- Build Retrieval-Augmented Generation (RAG) systems, including document chunking, embedding pipelines, semantic search, and knowledge-grounded reasoning layers.

- Evaluate and benchmark open-source and proprietary LLMs for specific workloads.

Model Deployment & Operationalization :

- Deploy models on cloud ML platforms with containerized services, inference optimization, and scalable serving endpoints.

- Implement model monitoring, anomaly detection, concept drift detection, explainability, and observability dashboards.

- Optimize inference latency, throughput, caching strategies, and hardware utilization (GPU/TPU).

Data & Infrastructure Collaboration :

- Work closely with data engineering teams to design scalable data architectures, ETL pipelines, and feature stores.

- Ensure high-quality data availability, lineage tracking, and versioning for reproducible ML experiments.

Performance & Systems Optimization :

- Optimize GPU/TPU usage, distributed compute strategies, memory management, and resource scheduling.

- Conduct performance profiling and optimize training and inference workflows at both the model and system levels.

Core Technical Skills :

Machine Learning & Deep Learning :

- Strong proficiency in classical ML algorithms, deep neural networks, GNNs, transformers, and generative models.

- Experience with large-scale model training, fine-tuning, and evaluation.

GenAI, LLMs & RAG :

- Hands-on experience with LLM frameworks, prompt engineering, vector databases, embedding models, and RAG workflows.

- Understanding of tokenization, attention mechanisms, chaining mechanisms, and LLM optimization techniques.

ML Engineering, MLOps & Infrastructure :

- Strong experience with Docker, Kubernetes, model registries, CI/CD pipelines, experiment tracking, and distributed systems.

- Familiarity with ML platforms such as Databricks, SageMaker, Vertex AI, Azure ML, or similar.

Programming & Data Systems :

- Expertise in Python and libraries such as PyTorch, TensorFlow, JAX, scikit-learn, and the Hugging Face ecosystem.

- Strong SQL skills and experience with data warehouses, data lakes, and real-time data pipelines.

Tools & Technologies :

- TensorFlow, PyTorch, JAX

- Hugging Face, LangChain, LlamaIndex

- MLflow, Kubeflow, Airflow, Feast

- Vector DBs : Pinecone, Weaviate, FAISS, Milvus

- Cloud : AWS, GCP, Azure

- Containerization & orchestration : Docker, Kubernetes

Qualifications :

- Bachelor's or Master's degree in Computer Science, Data Science, AI/ML, or a related technical field.

- 5+ years of experience in ML Engineering, with proven hands-on work in production-grade AI systems.

- Experience building and deploying LLM or GenAI solutions at scale is highly desirable.