Posted on: 10/08/2025
Job Description
Responsibilities:
- Manage and optimize end-to-end MLOps pipelines for data collection, model training, validation, and monitoring while ensuring team collaboration and effective resource allocation.
- Drive the implementation of model compression, quantization, and distributed training techniques to enhance performance, encouraging innovative solutions from team members.
- Track key metrics and optimize deployed models to ensure ongoing effectiveness, collaborating with team members to identify improvement opportunities.
- Collaborate with cloud architects and DevOps teams to design and maintain scalable ML infrastructure, ensuring effective resource management and deployment.
- Work closely with applied scientists and analysts to transform model requirements into production-ready solutions, facilitating teamwork across departments.
- Establish and maintain monitoring and alerting systems for deployed models, ensuring prompt issue resolution while guiding the team in best practices.
- Create and uphold documentation for ML architecture and best practices to ensure knowledge sharing within the team, promoting continuous improvement.
- Stay current with advancements in ML technologies and lead ongoing enhancement initiatives within the team, encouraging team participation in the ML community.
Requirements:
- Bachelor's, Master's, or PhD in Computer Science or a related field.
- 5 years of experience in machine learning with a strong portfolio of deployed ML models for various use cases, including batch, streaming, and real-time.
- Proficient in Python for model development and data manipulation, with experience in Java or Scala for building production systems.
- Familiarity with messaging queues (e.g., Kafka, SQS) and MLOps tools (e.g., MLflow, Kubeflow, Airflow).
- Experience with cloud platforms (AWS, Google Cloud, Azure) and containerization technologies (Docker, Kubernetes).
- Knowledge of machine learning frameworks (e.g., TensorFlow, PyTorch) and databases (e.g., Elasticsearch, MongoDB, PostgreSQL).
- Understanding of data processing and ETL tools (e.g., Apache Spark, Kafka).
- Experience with monitoring tools like Grafana and Prometheus.
- Strong problem-solving skills and an analytical mindset.