
Western Digital - Principal Machine Learning Engineer - Large Language Models

Posted on: 13/12/2025

Job Description

Role : Principal ML Engineer

Location : Bengaluru, India (Bangalore Hallmark Office - IBS)

Work Mode : Hybrid

Business Function : Data Science

Employment Type : Full-time

Job Summary :

Western Digital is seeking a highly experienced and technically proficient Principal ML Engineer (Scientist 4, Data Science) to serve as a technical leader in developing and deploying cutting-edge Machine Learning solutions, with a strong focus on Large Language Models (LLMs), complex agentic systems, and high-performance, scalable ML infrastructure.

This role demands deep expertise in both theoretical ML foundations and practical, production-ready system design, driving innovation across various business units.

Key Responsibilities and Technical Focus :

- Lead the research, design, and implementation of advanced LLM-based applications, including fine-tuning proprietary models (e.g., using LoRA/QLoRA), optimizing inference pipelines, and ensuring model security and robustness.

- Architect and develop sophisticated AI Agent systems utilizing frameworks like LangChain or LangGraph, focusing on complex reasoning, tool use, memory management, and multi-step workflow automation.

- Drive Context Engineering initiatives, including advanced Retrieval-Augmented Generation (RAG) techniques, prompt optimization, data chunking strategies, and vector database integration (e.g., Pinecone, Milvus) for ground-truth synthesis.

- Design and implement end-to-end production-grade ML pipelines (MLOps), encompassing data ingestion, feature engineering, model training, continuous integration/continuous delivery (CI/CD), and monitoring/alerting systems.

- Spearhead the development of scalable and fault-tolerant ML infrastructure leveraging cloud services (AWS/GCP), containerization (Docker, Kubernetes), and workflow orchestration (e.g., Airflow, Kubeflow).

- Conduct complex data analysis and utilize advanced Data Structures and Algorithms to optimize model performance, computational efficiency, and resource utilization, especially for large-scale data processing.

- Collaborate closely with cross-functional engineering, product, and business teams to translate complex business problems into viable, high-impact ML solutions.

- Mentor junior data scientists and engineers on best practices in scalable ML engineering, model development lifecycle, and code quality.

Required Technical Skills :

- LLM Expertise : Deep, hands-on experience in the lifecycle management of LLMs, including specialized techniques for Fine-tuning (e.g., parameter-efficient methods), Agent Creation, and sophisticated Context Engineering / RAG implementations.

- Programming & ML Frameworks : Expert proficiency in Python for high-performance computing, along with deep working knowledge of major ML frameworks such as TensorFlow and/or PyTorch.

- LLM Tooling : Extensive experience with modern LLM orchestration and development frameworks, particularly LangChain and LangGraph.

- Core Foundations : Strong theoretical and practical mastery of Data Structures and Algorithms (DSA), complexity analysis, and object-oriented programming for building efficient, scalable systems.

- Cloud & MLOps : Proven track record with MLOps principles and tools for deploying and managing models in production. Experience with at least one major cloud provider (AWS/GCP/Azure).

Preferred Skills (Highly Relevant) :

- Database Technologies : Experience with graph databases like Neo4j for relationship modeling in complex RAG or agent memory systems, and proficiency with relational databases like Postgres (including extensions like pgvector).

- Cloud LLM Platforms : Practical experience implementing solutions on managed AI/ML platforms such as Amazon Bedrock or Google Vertex AI.

- Performance Optimization : Knowledge of hardware acceleration and distributed computing techniques (e.g., Dask, Spark, NVIDIA GPUs) for training and inference of large models.

- Domain Knowledge : Previous experience applying ML/Data Science solutions within the semiconductor, storage, or manufacturing domains.

