Posted on: 17/12/2025
We are looking for a Senior MLOps Engineer to drive the design, deployment, and scalability of machine learning and LLM systems in production. This role focuses on end-to-end ML lifecycle management, production-grade ML infrastructure, and close collaboration with Data Science, Backend, and Cloud teams to ensure reliable, secure, and high-performing ML solutions.
Key Responsibilities :
MLOps & ML Lifecycle :
- Design, build, and maintain end-to-end MLOps pipelines covering model training, validation, deployment, monitoring, and retraining.
- Productionize machine learning and LLM models, ensuring reliability, scalability, and low latency.
- Implement model versioning, experiment tracking, and reproducibility.
- Monitor model performance, data drift, and system health in production environments.
ML Platform & Infrastructure :
- Build and maintain ML infrastructure on cloud platforms (Azure preferred, AWS acceptable).
- Deploy ML workloads using Docker and Kubernetes (EKS/AKS).
- Develop and manage CI/CD pipelines for ML workflows and model releases.
- Support high-availability, fault-tolerant ML systems in production.
Backend & Integration :
- Develop and maintain Python-based services and APIs (Flask/Django/FastAPI) to serve ML models.
- Integrate ML pipelines with data platforms, feature stores, and downstream systems.
- Work closely with Backend and Architecture teams to ensure seamless ML system integration.
Collaboration & Ownership :
- Partner with Data Scientists, DevOps, QA, Security, and Product teams to move models from research to production.
- Take ownership of design initiatives, from architecture decisions to execution.
- Implement security, data privacy, and governance best practices across ML systems.
Requirements :
Must-Have :
- 5+ years of experience in Python development, with significant exposure to MLOps or ML systems.
- Strong experience with ML deployment and production workflows.
- Hands-on expertise with Docker, Kubernetes, and CI/CD pipelines.
- Experience working with cloud platforms (Azure preferred, AWS acceptable).
- Solid understanding of SQL, relational databases, and performance optimization.
- Prior collaboration with Data Science teams to productionize ML models.
- Fluency in English (written and verbal).
Good to Have :
- Experience with ML libraries and data frameworks (scikit-learn, pandas, PySpark, PyArrow).
- Exposure to big data technologies (Snowflake, Spark).
- Familiarity with LLM workflows and inference optimization.
- Experience implementing monitoring, logging, and observability for ML systems.
Soft Skills :
- Strong problem-solving and analytical mindset.
- Proactive, independent, and ownership-driven.
- Excellent collaboration and communication skills.
- Adaptable and comfortable in fast-paced environments.
- Continuous learner with a passion for ML systems.
Posted in: DevOps / SRE
Functional Area: ML / DL Engineering
Job Code: 1591881