Posted on: 08/12/2025
Description:
About the Role:
We are seeking a highly skilled AI Specialist with expertise in Large Language Models (LLMs) to design, build, and optimize AI-driven solutions.
You will work on cutting-edge model development, fine-tuning, evaluation, and deployment of LLM-powered systems that solve real-world business problems.
This role requires deep technical experience with modern AI architectures, prompt engineering, and scalable ML infrastructure.
Key Responsibilities:
- Develop, fine-tune, and optimize LLMs using frameworks such as PyTorch, TensorFlow, JAX, or Hugging Face Transformers.
- Build production-grade pipelines for training, inference, evaluation, and monitoring of LLM-based systems.
- Design and implement Retrieval-Augmented Generation (RAG) solutions, vector stores, embeddings, and hybrid search.
- Create robust prompt engineering strategies, system instructions, and guardrails for safety and reliability.
- Architect scalable ML infrastructure using GPUs, distributed training, and MLOps tools.
- Evaluate models using metrics such as perplexity, accuracy, hallucination detection, and safety scoring.
- Collaborate with cross-functional teams (Data Engineering, Product, Backend) to integrate AI into applications.
- Research new LLM architectures, fine-tuning techniques (LoRA, QLoRA, PEFT), and model compression methods; see the fine-tuning sketch after this list.
- Ensure data privacy, model governance, and compliance with AI safety and ethical standards.
- Troubleshoot complex model behavior, optimize inference latency, and reduce compute cost.
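To illustrate the kind of fine-tuning work this role covers, below is a minimal LoRA fine-tuning sketch using Hugging Face Transformers and PEFT. The base model, dataset, and hyperparameters are placeholders chosen purely for illustration, not a prescribed stack.

```python
# Minimal LoRA fine-tuning sketch; model, dataset, and hyperparameters are
# placeholders for illustration only.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_model = "gpt2"  # stand-in for whatever base LLM the project uses
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Attach low-rank adapters so only a small fraction of weights is trained.
model = get_peft_model(
    model, LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
)
model.print_trainable_parameters()

# Tokenize a small public text corpus (used purely as a stand-in dataset).
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True, remove_columns=raw.column_names,
).filter(lambda ex: len(ex["input_ids"]) > 0)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=4,
                           max_steps=100, logging_steps=20),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Only the adapter parameters are updated here, which is what makes LoRA/QLoRA-style fine-tuning feasible on modest GPU budgets.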
Required Qualifications:
- Bachelor's, Master's, or PhD degree in Computer Science, AI/ML, Data Science, or a related field.
- 5-7+ years of experience in machine learning, with at least 1-2 years working specifically on LLMs.
- Strong proficiency with Python, deep learning frameworks (PyTorch preferred), and transformer architectures.
- Hands-on experience with LLM fine-tuning, prompt engineering, and evaluation.
- Understanding of RAG pipelines, vector databases (FAISS, Pinecone, Weaviate, Chroma), and embeddings; see the retrieval sketch after this list.
- Proficiency with MLOps tools: MLflow, Weights & Biases, Kubeflow, Ray, or similar.
- Experience deploying models on cloud platforms (AWS SageMaker, GCP Vertex AI, Azure ML) or GPU clusters.
- Solid understanding of NLP techniques, tokenization, attention mechanisms, and model optimization.
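The retrieval sketch referenced above is a minimal example of the RAG building blocks listed in the qualifications, assuming sentence-transformers for embeddings and FAISS as the vector index; the corpus, model name, and query are hypothetical.

```python
# Minimal retrieval sketch for a RAG pipeline; corpus, model name, and query
# are hypothetical.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

corpus = [
    "LoRA adds trainable low-rank matrices to frozen transformer weights.",
    "FAISS provides efficient similarity search over dense vectors.",
    "Perplexity measures how well a language model predicts held-out text.",
]

# Embed the corpus and build an exact inner-product index (cosine similarity
# via normalized vectors).
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(corpus, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_vecs.shape[1])
index.add(np.asarray(doc_vecs, dtype="float32"))

# Retrieve the passages most relevant to a query; in a full RAG system these
# would be inserted into the LLM prompt as grounding context.
query = "How does low-rank adaptation work?"
q_vec = embedder.encode([query], normalize_embeddings=True)
scores, ids = index.search(np.asarray(q_vec, dtype="float32"), k=2)
for score, i in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {corpus[i]}")
```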