MLOps Engineer

HyrEzy Talent Solutions

Bangalore

1 - 6 Years

MLOps, CI/CD Pipeline, Docker, Kubernetes, Monitoring Tools, DevOps, Python, AWS, Azure, Google Cloud Platform

Posted on: 04/12/2025

Job Description

Details & Specification :

Experience : 1 - 6 Years

Location : Bangalore, Mumbai, or Gurgaon (Hybrid / Work from Office)

Mandatory Background : B.Tech/M.Tech from IIT, NIT, BITS Pilani, or IIIT

The Company : High-Growth B2B AI SaaS

About the Role :

We are seeking a proactive and skilled MLOps Engineer to serve as the critical link between our Data Science and Platform Engineering teams. You will be responsible for designing and building the automated infrastructure necessary to reliably train, version, deploy, and monitor our proprietary AI/ML models in a complex, multi-tenant enterprise environment. Your success directly ensures the continuous improvement and high availability of our core AI product features.

Key Technical Responsibilities :

1. MLOps Pipeline and Automation :

- CI/CD for ML : Design and implement robust CI/CD pipelines for machine learning models, including Continuous Training (CI/CD/CT), using tools like Kubeflow, MLflow, or DVC.

- Model Versioning : Establish a standardized system for model artifact management, version control, and lineage tracking to ensure reproducibility and auditability.

- Feature Store : Contribute to the design and implementation of a centralized Feature Store to enable feature reusability, consistency between training and serving, and low-latency feature retrieval.

2. Model Deployment and Serving :

- Containerization : Containerize ML models using Docker and orchestrate their deployment on Kubernetes clusters for scalable, real-time inference and batch predictions.

- A/B Testing Framework : Develop and manage a standardized framework for safely deploying, testing, and rolling back models in production (e.g., canary deployments, shadow mode).

- API Development : Work with Backend Engineers to define and optimize the APIs used for model serving, ensuring low-latency responses for mission-critical client workflows.
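
For illustration only, a canary rollout of the kind mentioned above can be sketched as a deterministic traffic split. This is a minimal sketch, not the company's framework: the function name `route_model`, the labels "stable" and "canary", and the 10,000-bucket granularity are all assumptions made for the example.

```python
import hashlib


def route_model(request_id: str, canary_fraction: float) -> str:
    """Pick "canary" for roughly canary_fraction of request IDs, else "stable".

    Hypothetical helper: hashing the ID keeps the choice stable, so the same
    request ID is always served by the same model variant.
    """
    if not 0.0 <= canary_fraction <= 1.0:
        raise ValueError("canary_fraction must be between 0 and 1")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 hash bytes to one of 10,000 buckets (assumed granularity;
    # the modulo bias at this size is negligible).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    threshold = round(canary_fraction * 10_000)
    return "canary" if bucket < threshold else "stable"
```

Rolling back in this sketch amounts to setting `canary_fraction` to 0, which sends every request to the stable model again.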

3. Monitoring and Governance :

- Model Monitoring : Implement proactive monitoring solutions to track model health, performance metrics (e.g., accuracy, latency), data drift, and model decay in production.

- Alerting : Set up comprehensive alerting on data quality issues and performance degradation, and automate the necessary retraining or mitigation workflows.

- Resource Optimization : Manage and optimize the cloud infrastructure costs associated with model training and serving, primarily on AWS or GCP.
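
As a rough illustration of the data drift checks listed above, one common approach is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with what is seen in production. The sketch below is an assumption-laden example, not a prescribed method: the function name, the 10 equal-width bins, and the small floor used to avoid dividing by zero are all choices made for the example. A PSI above about 0.2 is often treated as a rule-of-thumb signal of meaningful drift, but that threshold is a convention and should be tuned per feature.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a reference sample (expected) and a live sample (actual)."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the reference range; fall back to 1.0 if it is flat.
    width = (hi - lo) / bins or 1.0

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped to the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each share at a small epsilon so the log ratio stays finite.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

An alerting job could compute this per feature on a schedule and raise an alert, or trigger a retraining workflow, when the value crosses the chosen threshold.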

What You'll Bring (Mandatory Skills & Experience) :

- Educational Excellence : B.Tech/M.Tech in Computer Science or a related discipline from an IIT, NIT, BITS Pilani, or IIIT is mandatory.

- Experience : 1-6 years of experience, with at least 1 year specifically focused on MLOps, DevOps, or building production-grade ML infrastructure.

- Core Tools : Strong practical experience with Kubernetes, Docker, Python, and MLOps platforms (Kubeflow/MLflow).

- Cloud & CI/CD : Proficiency with AWS (or GCP) and experience building CI/CD pipelines (Jenkins/GitLab/GitHub Actions).

- ML Fundamentals : Solid understanding of the ML lifecycle, model serving architectures, and data warehousing concepts.