Posted on: 05/12/2025
Description:
Role & Responsibilities
We are looking for a Senior MLOps Engineer with 8+ years of experience building and managing production-grade ML platforms and pipelines. The ideal candidate will have strong expertise across AWS, Airflow/MWAA, Apache Spark, Kubernetes (EKS), and automation of ML lifecycle workflows. You will work closely with data science, data engineering, and platform teams to operationalize and scale ML models in production.
Key Responsibilities:
- Design and manage cloud-native ML platforms supporting training, inference, and model lifecycle automation.
- Build ML/ETL pipelines using Apache Airflow / AWS MWAA and distributed data workflows using Apache Spark (EMR/Glue).
- Containerize and deploy ML workloads using Docker, EKS, ECS/Fargate, and Lambda.
- Develop CI/CT/CD pipelines integrating model validation, automated training, testing, and deployment.
- Implement ML observability: model drift, data drift, performance monitoring, and alerting using CloudWatch, Grafana, Prometheus.
- Ensure data governance, versioning, metadata tracking, reproducibility, and secure data pipelines.
- Collaborate with data scientists to productionize notebooks, experiments, and model deployments.
Ideal Candidate
- 8+ years in MLOps/DevOps with strong ML pipeline experience.
- Strong hands-on experience with AWS:
Compute/Orchestration: EKS, ECS, EC2, Lambda
Data: EMR, Glue, S3, Redshift, RDS, Athena, Kinesis
Workflow: MWAA/Airflow, Step Functions
Monitoring: CloudWatch, OpenSearch, Grafana
- Strong Python skills and familiarity with ML frameworks (TensorFlow/PyTorch/Scikit-learn).
- Expertise with Docker, Kubernetes, Git, CI/CD tools (GitHub Actions/Jenkins).
- Strong Linux, scripting, and troubleshooting skills.
- Experience enabling reproducible ML environments using Jupyter Hub and containerized development workflows.
Education:
- Masters degree in Computer Science, Machine Learning, Data Engineering, or related field.
Perks, Benefits and Work Culture
- Competitive Salary Package
- Generous Leave Policy
- Flexible Working Hours
- Performance-Based Bonuses
- Health Care Benefits
Did you find something suspicious?
Posted By
Posted in
DevOps / SRE
Functional Area
ML / DL Engineering
Job Code
1585672
Interview Questions for you
View All