- As an MLOps Engineer II, you will play a key role in designing, building, and operating a scalable and reliable machine learning platform and production inference systems.

- You will work closely with Data Scientists and Platform teams to operationalize end-to-end ML workflows on AWS, ensuring models move seamlessly from experimentation to production and monitoring.

- In this role, you are expected to operate with a high degree of ownership, contribute to architectural decisions, and mentor junior engineers and interns.

- You will also contribute to advanced initiatives such as Agentic AI systems and MCP servers, helping the team adopt emerging AI infrastructure patterns while maintaining strong MLOps fundamentals.

- Design, build, deploy, and maintain production-grade ML pipelines and workflows using AWS and Python, with a focus on reliability, scalability, and observability.

- Own and enhance the MLOps platform that automates the full ML model lifecycle from data annotation and training to inference, monitoring, and feedback loops.

- Collaborate closely with Data Scientists to productionize models, including packaging, versioning, deployment strategies, and performance optimization.

- Contribute to Agentic AI initiatives, including evaluation and deployment of MCP servers and related infrastructure components.

- Implement monitoring, logging, alerting, and CI/CD best practices for ML systems to ensure production stability and rapid issue resolution.

- Troubleshoot complex pipeline, infrastructure, and inference issues, performing root cause analysis and driving long-term fixes.

- Stay current with evolving MLOps practices, cloud-native ML tooling, and emerging AI infrastructure trends, and proactively introduce improvements.

- Participate in design reviews, technical discussions, and planning meetings; clearly communicate progress, risks, and trade-offs to stakeholders.

- Mentor interns and junior engineers by providing technical guidance, code reviews, and best practices.

- 3-6 years of hands-on experience building and operating ML or data platforms, with a strong focus on MLOps or ML infrastructure.

- Strong practical experience with AWS services such as SageMaker, S3, EC2, Batch, Lambda, IAM, and monitoring tools.

- Proficiency in Python for building ML pipelines, automation, and infrastructure tooling.

- Solid understanding of the ML lifecycle, including training, evaluation, deployment, inference, and model monitoring.

- Experience with containerization (Docker) and familiarity with orchestration frameworks (e.g., Kubernetes or managed equivalents).

- Strong problem-solving skills and the ability to independently drive tasks in a fast-paced, evolving environment.

- Effective communication skills and experience collaborating across Data Science and Engineering teams.

Preferred Experience:

- Experience designing or operating end-to-end MLOps platforms supporting multiple models, teams, or use cases.

- Familiarity with CI/CD systems and Git-based workflows.

- Hands-on experience with ML inference systems (real-time or batch), including performance tuning and cost optimization.

- Exposure to or active work in Agentic AI, GenAI infrastructure, or MCP servers.

- Demonstrated ability to mentor junior engineers and raise overall team engineering quality.

- Strong aptitude for evaluating and adopting new technologies as AI and MLOps ecosystems evolve.

Posted by: M Arun Kumar, Recruiter Intern at Eagleview Solutions Private Limited

Posted in: DevOps / SRE

Functional Area: ML / DL / AI Research

Job Code: 1615580
