DataWeave - Lead/Senior Data Scientist - LLM Models

Posted on: 22/12/2025

Job Description

Description :

About Us.

DataWeave is a SaaS-based digital commerce analytics platform that empowers retailers with competitive intelligence and equips consumer brands with digital shelf analytics globally.

Using proprietary AI technology, DataWeave analyzes over 500 billion data points across 400,000+ brands, 4,000+ websites, and 20+ industry verticals.

Our clients include Nordstrom, Overstock, The Home Depot, Mars, Bush Brothers, Mondelez, Pernod Ricard, and more.

We are a globally distributed team of 220+ engineers, product managers, and eCommerce experts with technology offices in Bangalore.

What We Offer.

- Opportunities to work on cutting-edge AI research in NLP, Computer Vision, and Large Language Models (LLMs).

- Immediate impact on product and business decisions in the retail/eCommerce domain.

- End-to-end ownership of projects from ideation to deployment.

- Culture of openness, collaboration, and mentorship.

- Flexible work environment and continuous learning opportunities.

- Competitive rewards and fast-paced career growth.

Role Overview.

The Lead / Sr Data Scientist will drive AI innovation and solve complex business problems in the retail domain.

The role involves developing production-ready ML/DL models, leading research initiatives, mentoring junior team members, and ensuring AI solutions align with product strategy.

Responsibilities:

- Build robust ML models using state-of-the-art architectures for NLP, Computer Vision, and Deep Learning.

- Solve complex retail problems such as product matching, attribute extraction, and price optimization.

- Optimize models for scalability, efficiency, and deployment with MLOps best practices.

- Take end-to-end ownership of AI projects, from research to production.

- Mentor and guide junior team members, fostering a culture of innovation and collaboration.

- Collaborate with cross-functional teams to translate business problems into AI solutions.

Required Qualifications:

- Bachelor's degree in Computer Science, Data Science, Mathematics, or a related field.

- 6+ years of hands-on experience in AI/ML development (3+ years acceptable with exceptional expertise in GenAI/LLMs).

- Expert-level Python proficiency with experience in PyTorch or TensorFlow.

- Strong experience in Generative AI, LLMs, vision-language models, and multimodal systems.

- Hands-on experience with NLP and CV libraries: spaCy, NLTK, HuggingFace Transformers, OpenCV.

- Experience in model training, fine-tuning, quantization, evaluation, and deployment of transformer-based models (BERT, GPT, T5, LLaMA, etc.).

- Familiarity with model optimization and scalability techniques (quantization, distillation, pruning, ONNX, TensorRT-LLM, DeepSpeed, etc.).

- Strong understanding of LLM ecosystems including OpenAI, Anthropic, Meta, Google, Mistral, AWS Bedrock.

- Proven ability to lead projects and mentor teams in a high-velocity product environment.

Preferred / Good to Have:

- Master's or PhD in Computer Science, AI/ML, Applied Math, or related fields.

- Experience in startups or high-growth environments with an ownership mindset.

- Experience building full MLOps pipelines (MLflow, Kubeflow, Airflow, SageMaker, Vertex AI).

- Experience with LLM fine-tuning and parameter-efficient training (PEFT: LoRA, QLoRA, DoRA, Adapters, etc.).

- Experience with LangChain, LangGraph, LlamaIndex, and multi-agent workflows.

- Experience building Retrieval-Augmented Generation (RAG) pipelines using vector DBs like Pinecone, Chroma, Qdrant, Weaviate, or FAISS.

- Practical experience in evaluating LLM applications using Ragas, DeepEval, Promptfoo, or custom frameworks.

- Knowledge of modern research in Transformer optimizations, self-supervised learning, agentic AI, and efficient training frameworks.

- Contributions to open-source ML/AI projects, publications, or active participation in research communities.

