Posted on: 05/03/2026
Description :
Key Skills : Kubernetes, vLLM, ML Model deployment, CI/CD pipeline for Model tuning, Model lifecycle management in Azure
About the Role :
We are looking for a hands-on MLOps Engineer who can take ML models from notebook chaos to production reality.
You'll own the end-to-end ML deployment pipeline, from training orchestration to scalable inference, using Kubernetes, vLLM, CI/CD automation, and Azure ML services.
If you love automating everything, optimizing inference performance, and building rock-solid ML infrastructure, then this role is for you.
Key Responsibilities :
Model Deployment & Serving :
- Deploy and manage ML/LLM models in production using Kubernetes
- Implement high-performance inference pipelines using vLLM / Triton, served over REST and gRPC APIs
- Optimize model latency, throughput, GPU utilization and autoscaling
CI/CD for ML Pipelines :
- Build automated CI/CD pipelines for model training, tuning, validation and deployment
- Integrate GitHub Actions / Azure DevOps / Jenkins for continuous integration
- Enable automated rollback and versioning of ML models
Model Lifecycle Management :
- Manage full ML lifecycle including training, experiment tracking, model registry, deployment, monitoring and retraining
- Implement model version control, data drift detection and performance monitoring
- Automate scheduled retraining workflows
Azure Cloud & Infrastructure :
- Design and maintain ML infrastructure on Microsoft Azure (Azure ML, AKS, Blob Storage, ACR, Key Vault)
- Manage GPU clusters, networking, secrets and resource optimization
- Implement secure production-grade architecture
Monitoring & Reliability :
- Implement monitoring using Prometheus, Grafana and Azure Monitor
- Track inference metrics, system health and model performance
- Ensure high availability and fault tolerance
Required Skills :
Core MLOps Skills :
- Strong experience with Kubernetes (AKS preferred)
- Experience deploying ML/LLM models using vLLM / FastAPI / TorchServe / Triton
- Hands-on experience with CI/CD pipelines for ML workflows
- Experience with model versioning and lifecycle management tools
Cloud & DevOps :
- Strong hands-on experience with Azure Cloud services
- Experience with Docker, Helm, Terraform (good to have)
- Knowledge of infrastructure automation and scaling
Programming & Frameworks :
- Strong Python skills
- Experience with PyTorch / TensorFlow / HuggingFace
- REST API development for model serving
Posted in: DevOps / SRE
Functional Area: ML / DL Engineering
Job Code: 1617958