
Job Description


Position Title : ML Ops Engineer 4.

Fully onsite.

Full-Time role.

Shift Timings : Regular.

Address : Spire T110, Hyderabad Knowledge City, Madhapur, Hyderabad, Telangana, India, 500081.


Roles & Responsibilities :

- Define the long-term vision and strategy for MLOps initiatives : Set the direction for the organization's MLOps, model deployment, and monitoring practices.

- Lead and manage a team of MLOps engineers : Provide technical guidance, mentorship, and career development for team members.

- Identify and explore cutting-edge research areas and technologies : Stay abreast of the latest advancements in MLOps, model serving, and AI operations.

- Drive innovation and the development of novel MLOps solutions : Lead efforts, prototype new approaches, and oversee implementation of advanced MLOps platforms.

- Design and manage scalable ML infrastructure and pipelines on GCP; oversee model deployment (A/B testing, rollouts/rollbacks, auto-scaling), and establish monitoring/observability (performance, drift, KPIs).

- Ensure ML operations meet governance, security, compliance, and disaster recovery standards across the organization.

- Collaborate with executive leadership on strategic decision-making : Align MLOps initiatives with business objectives and organizational priorities.

- Establish and enforce MLOps standards and best practices : Ensure quality, reproducibility, and security of ML systems across the organization.

- Represent the organization in external MLOps communities : Speak at conferences, publish thought leadership, and build partnerships with academia and industry.

Technical Skills :

- 12+ years of experience.

- Mastery of relevant technical skills : Deep expertise in MLOps, model deployment, monitoring, and governance.

- Significant experience in designing and implementing complex MLOps systems at scale : Lead the architecture and deployment of large-scale MLOps platforms on GCP.

- Hands-on experience architecting large-scale ML platforms on GCP (Vertex AI, GKE, Dataflow, BigQuery, Pub/Sub, Cloud Composer), implementing experiment tracking (MLflow, Weights & Biases, TensorBoard), feature stores (Vertex AI), data pipelines and workflow orchestration, and ensuring cloud security, compliance, disaster recovery, and cost optimization.

- Strong leadership and team management skills : Build, mentor, and lead high-performing MLOps teams.

- Excellent strategic thinking and problem-solving abilities : Translate business challenges into scalable, reliable MLOps solutions.

- Exceptional communication and influencing skills : Advocate for MLOps initiatives, influence executive decisions, and represent the organization externally through conferences, publications, and industry engagement.

Must Have Skills :

- Deep expertise in MLOps, model deployment, monitoring, and governance.

- Experience building scalable MLOps platforms on GCP.

- Proficiency with CI/CD for ML, containerization and orchestration (e.g. Docker, Kubernetes), and IaC (Terraform).

- Leadership in MLOps strategy, standards, and cross-team collaboration.

- Hands-on expertise with GCP ML and data services (Vertex AI, Dataflow, BigQuery, Pub/Sub, Cloud Composer, GKE).

- Experience implementing model observability (performance monitoring, drift detection, dashboards, and alerts).

- Proficiency with experiment tracking (MLflow, W&B) and feature store management.

- Knowledge of cloud security, compliance, and cost optimization strategies.

