Posted on: 21/07/2025
Job Description:
We are seeking a highly skilled MLOps Engineer with over 7 years of experience to join our Data & AI Engineering team.
The ideal candidate will be responsible for designing, implementing, and maintaining robust and scalable machine learning operations infrastructure.
This role bridges the gap between data science and engineering, ensuring reliable model deployment, monitoring, and automation.
Key Responsibilities:
- Design and build scalable, production-grade MLOps pipelines for model training, validation, deployment, and monitoring.
- Automate ML workflows using tools such as Kubeflow, MLflow, Airflow, or SageMaker Pipelines.
- Develop and maintain CI/CD pipelines tailored for ML models and data pipelines.
- Manage infrastructure on cloud platforms (AWS, Azure, or GCP) to support the ML model lifecycle.
- Deploy models as APIs/microservices using Docker and orchestration tools like Kubernetes (K8s).
- Ensure scalability, availability, and fault tolerance of ML applications in production.
- Set up monitoring for model performance, drift detection, and data quality validation.
- Implement model versioning, auditing, and reproducibility standards.
- Collaborate with Data Science teams to streamline model experimentation and reproducibility.
- Ensure secure handling of PII and compliance with data governance standards (GDPR, HIPAA, etc.).
- Implement role-based access control and encrypted data storage.
- Work closely with Data Scientists, DevOps, Data Engineers, and Software Engineers.
- Provide mentorship and technical leadership to junior MLOps engineers or platform users.
- Translate business requirements into scalable and sustainable ML deployment solutions.
Must-Have Skills:
- Strong knowledge of CI/CD, model lifecycle management, and DevOps for ML
- Proficiency in Python, Bash, and automation scripting
- Hands-on experience with Docker, Kubernetes, and Terraform/CloudFormation
- Experience with MLflow, Airflow, DVC, or equivalent tools
- Deep understanding of cloud platforms: AWS, GCP, or Azure
- Experience working with ML frameworks such as TensorFlow, PyTorch, and Scikit-learn
- Familiarity with monitoring tools (Prometheus, Grafana, ELK, etc.)
- Solid understanding of data engineering principles and model versioning
Preferred Qualifications:
- Familiarity with Feature Stores (Feast, Tecton, etc.)
- Experience with distributed computing (Spark, Dask)
- Exposure to DataOps and Data Lake/Lakehouse architectures
- Knowledge of security protocols, OAuth, RBAC, and API Gateways