Senior Data Scientist - Senior Data Scientist Models

STAFFINGTON CONSULTING PRIVATE LIMITED
Hyderabad
5 - 8 Years

Posted on: 16/07/2025

Job Description

Responsibilities:

- Design, implement, and optimize LLM pipelines for NLP tasks such as summarization, classification, and entity recognition.

- Build and deploy AI models to production with scalable, testable code.

- Develop and fine-tune models using domain-specific datasets, integrating them into live systems.

- Create retrieval-augmented systems using vector stores and embeddings for contextual Q&A.

- Support ASR-based NLP systems with diarization, tagging, and post-processing.

- Work closely with senior leadership to ensure technical alignment with product goals.

- Contribute to AI/ML best practices and help standardize development workflows across teams.


Primary Skills (Must have):


- Strong Python development with OOP principles and ML/AI best practices.

- Expertise in machine learning, deep learning, and NLP using libraries like scikit-learn, PyTorch, TensorFlow.

- Practical experience with LLM fine-tuning, LoRA, PEFT, and embedding-based models.

- Experience building RAG pipelines for question-answering, search, and knowledge summarization.

- Hands-on experience with vector stores (FAISS, Pinecone, ChromaDB), transformers, and Hugging Face models.

- Experience deploying models via FastAPI, Flask, and Docker.

- Good knowledge of speech-to-text tools like Whisper and AWS/GCP STT for transcription.

- Familiarity with prompt engineering, LangChain, and retriever models.

- Experience with cloud ML platforms (AWS, Azure, GCP) for model training and inference.

- Exposure to MLOps practices like versioning, monitoring, and automated retraining.


Secondary Skills:


- Domain knowledge of NLP use cases in financial services, marketplaces, healthcare, pharma, and life sciences.

- Experience with knowledge graphs, graph neural networks, and embeddings-based reasoning.

- Understanding of privacy-preserving ML (e.g., differential privacy, federated learning).

- Exposure to LangChain, LlamaIndex, Haystack, or other GenAI orchestration tools.

