Posted on: 16/01/2026
Key Responsibilities:
- Design and implement MLOps pipelines for training, validation, deployment, and monitoring of machine learning models.
- Develop and maintain infrastructure for data versioning, model registries, and experiment tracking (e.g., MLflow, lakeFS, Airflow).
- Integrate orchestration tools (e.g., Kubeflow, Ray, Airflow) to support automated workflows and distributed training.
- Collaborate with data scientists and software engineers to ensure seamless model handoff and deployment.
- Build APIs and SDKs to abstract infrastructure complexity and enable self-service model development.
- Implement monitoring and alerting systems for model drift, performance degradation, and system health.
- Support on-prem and cloud-based deployments (e.g., Kubernetes, HPC clusters, AWS).
Qualifications:
Required Qualifications:
- Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.
- 3+ years of experience in software development, preferably in AI/ML infrastructure or data platforms.
- Proficiency in Python and/or TypeScript/JavaScript.
- Experience with backend frameworks (e.g., FastAPI, Flask, Node.js) and frontend libraries (e.g., React, Vue).
- Familiarity with cloud services (AWS preferred), containerization (Docker), and orchestration (Kubernetes).
- Strong understanding of RESTful APIs, CI/CD pipelines, and Git-based workflows.
Preferred Qualifications:
- Experience with distributed training frameworks (e.g., Ray, Ray Tune).
- Knowledge of model explainability, monitoring, and rollback strategies.
- Exposure to hybrid cloud/on-prem infrastructure and HPC environments.
- Prior work on internal platforms or developer tools.
Posted by: Chetan, Senior HR Consultant – IT & Industrial at PROCIGMA BUSINESS SOLUTIONS PRIVATE LIMITED
Last Active: 21 Jan 2026
Posted in: Full Stack
Functional Area: Full-Stack Development
Job Code: 1602212