Posted on: 13/12/2025
Description :
We are seeking an experienced MLOps Engineer with strong Databricks expertise to build, scale, and operationalize our machine learning ecosystem.
This role demands deep hands-on experience with MLflow, CI/CD automation, Databricks administration, and workflow orchestration to ensure production-grade reliability, governance, and performance of ML solutions.
The ideal candidate combines software engineering rigor with a strong understanding of ML lifecycle management, distributed compute, and cloud-native MLOps architectures.
Key Responsibilities :
MLOps & MLflow Engineering :
- Collaborate closely with data scientists to productionize ML models, ensuring reproducibility and reliability.
- Build, automate, and maintain MLflow pipelines for experiment tracking, model versioning, model registry, and deployment.
- Implement MLflow tracking servers, model artifact repositories, and serving endpoints within Databricks.
- Manage model promotion processes across Dev, QA, and Production environments with strong governance and validation controls.
- Establish best practices for feature engineering consistency, model lineage, and reproducibility.
CI/CD & Automation :
- Design and maintain CI/CD pipelines for Databricks notebooks, workflows, MLflow models, and data pipelines.
- Integrate build/deploy pipelines with tools such as Azure DevOps, GitHub Actions, or Jenkins.
- Enforce automated testing, linting, quality checks, and incremental deployment strategies.
- Ensure seamless code integration, versioning, and deployment across multiple environments.
Databricks Platform Administration :
- Manage Databricks clusters, pools, jobs, permissions, and workspace configurations.
- Optimize compute usage, cluster policies, and the cost-efficiency of job execution.
- Implement security, role-based access, token management, and workspace isolation practices.
- Collaborate with cloud teams for VNet integration, networking, and infrastructure hardening.
Notebook & Pipeline Development :
- Develop modular and reusable Databricks notebooks for data ingestion, processing, quality checks, and model training.
- Implement scalable pipeline patterns using PySpark, SQL, Delta Lake, and MLflow.
- Enforce best practices for coding standards, exception handling, and logging.
Databricks Workflows & Orchestration :
- Design and manage Databricks Workflows for end-to-end orchestration of notebooks, Delta Live Tables (DLT), and ML pipelines.
- Implement workflow dependencies, retry logic, alerting, and SLA monitoring.
- Automate ML model deployment workflows, including batch scoring, streaming inference, and scheduled retraining.
Unity Catalog & Data Governance :
- Implement Unity Catalog governance for data, models, and notebooks.
- Configure access controls, catalogs, schemas, and lineage tracking across the platform.
- Ensure compliance with data security, audit, and organizational governance standards.
- Implement scalable permission models aligned with enterprise policies.
Collaboration, Documentation & Best Practices :
- Work closely with data engineers, data scientists, business teams, and cloud engineers.
- Create detailed documentation for workflows, operational runbooks, CI/CD pipelines, and platform configurations.
- Establish standards for version control, repository structures, environment management, and ML lifecycle processes.
Required Skills & Qualifications :
- 5+ years of experience in MLOps, ML platforms, or data engineering roles.
- Strong expertise in Databricks, MLflow, Delta Lake, Databricks Workflows, and Unity Catalog.
- Hands-on experience deploying ML models in production using MLflow or an equivalent tool.
- Strong Python, PySpark, SQL, and distributed data processing experience.
- Deep understanding of CI/CD tools (Azure DevOps, GitHub Actions, Jenkins).
- Working knowledge of cloud platforms (Azure/AWS/GCP) and infrastructure fundamentals.
- Experience with monitoring, logging, and observability tools.
- Ability to troubleshoot complex ML pipelines and distributed computing issues.
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1589683