Posted on: 07/09/2025
Job Description:
This individual contributor role reports to the Manager of the Global Factory IT group. We are looking for a sharp, driven, and autonomy-loving team member to join our IT team to shape and build efficient self-learning applications. The contractor will be involved in developing an AI Development and Model Training environment to support global AI Solutions development needs.
Key Responsibilities:
- Design and develop a global AI Training environment that dynamically allocates compute and GPU resources based on model training requirements.
- Integrate the training environment with the existing Factory MLOps platform for model tracking, cataloging, and deployment using tools such as MLflow and KServe.
- Develop a common UI portal for Data Scientists and Communities of Practitioners (CoP) to access the environment, and provide APIs for integration with other factory systems.
- Collaborate with teams across different sites and departments to ensure the environment meets stakeholder requirements.
- Implement the environment across Seagate's hybrid infrastructure, including AWS cloud and on-premises systems.
- Support departmental usage tracking and billing through a chargeback model.
- Provide technical support to users encountering issues with the environment.
Required Qualifications:
- Bachelor's or Master's degree in Computer Science or a related field.
- Outstanding analytical and problem-solving skills.
- Familiarity with containerized environments, Kubernetes/Docker, and Rancher.
- Strong understanding of data structures, microservices application design, network protocols, publish-subscribe models, and JSON.
- Proficiency in Python, VueJS, and web services design/development.
- Experience with virtual machines (VMs), containerized systems, and cloud infrastructure basics (AWS).
- Operating system experience in both Linux and Windows.
Preferred Qualifications:
- Proven experience in Linux system administration and containerized system development.
- Hands-on experience with messaging technologies such as RabbitMQ or Kafka.
- Understanding of GenAI, AI/ML solution architecture, and deployment in manufacturing environments.
- Excellent communication, stakeholder engagement, and team collaboration skills.
Posted By
Ranjith Chandran
Delivery Manager at Lavu Tech Solutions Sdn Bhd
Last Active: NA as recruiter has posted this job through third party tool.
Posted in
DevOps / SRE
Functional Area
ML / DL / AI Research
Job Code
1542220