Senior Cloud & ML Infrastructure Engineer

ORBION INFOTECH

Multiple Locations

7 - 10 Years

MLOps, Machine Learning, AWS, Google Cloud Platform, Azure, Docker, Kubernetes, CI/CD, Monitoring Tools, TensorFlow, Python

Posted on: 10/09/2025

Job Description

Key Responsibilities:

- Architect, deploy, and manage scalable ML infrastructure on cloud platforms (AWS, GCP, or Azure).

- Design and maintain end-to-end ML pipelines for training, testing, and deploying models.

- Work with Kubernetes, Docker, and CI/CD to automate ML workflows and deployments.

- Collaborate with data scientists to optimize model training and inference performance.

- Implement monitoring, logging, and alerting systems for ML applications in production.

- Ensure data security, compliance, and cost optimization in cloud environments.

- Integrate distributed computing frameworks (Spark, Ray, Dask, etc.) for large-scale data processing.

- Research and adopt best practices for MLOps and cutting-edge ML infrastructure technologies.

Requirements:

- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

- 7+ years of experience in cloud engineering, DevOps, or ML infrastructure roles.

- Strong expertise in cloud platforms (AWS, GCP, Azure) including compute, storage, and networking.

- Hands-on experience with Kubernetes, Docker, and Terraform for infrastructure automation.

- Solid knowledge of ML frameworks (TensorFlow, PyTorch) and MLOps tools (Kubeflow, MLflow, SageMaker, Vertex AI, etc.).

- Strong programming skills in Python, Go, or Java.

- Experience with distributed training, GPU acceleration, and model deployment at scale.

- Knowledge of CI/CD pipelines, monitoring tools (Prometheus, Grafana), and logging systems.

- Strong problem-solving, communication, and leadership skills.