Posted on: 10/08/2025
Senior Data Scientist at Shiprocket
Job Overview:
Shiprocket is looking for a highly skilled and experienced Senior Data Scientist to join our dynamic team. As a Senior Data Scientist, you will play a critical role in leveraging data to drive insights and solutions that enhance our logistics platform. You will be responsible for leading data-driven projects, developing predictive models, and working closely with cross-functional teams to optimize operations and improve customer experiences.
In this role, you will also build and scale large-scale machine learning systems, work on GenAI applications including LLMs and RAG pipelines, and lead efforts in fine-tuning models (LoRA, QLoRA, PEFT), MLOps productionization, and vector database integration for real-time performance.
Responsibilities:
- Lead Data Science Projects: Oversee the end-to-end execution of data science projects, from data collection and cleaning to model development, validation, and deployment.
- Predictive Modeling: Develop and implement advanced predictive models to solve complex business problems and drive strategic decision-making.
- Data Analysis: Conduct deep-dive analyses to uncover actionable insights and trends that inform business strategies and operations.
- Collaboration: Work closely with product managers, engineers, and other stakeholders to integrate data science solutions into our products and services.
- Innovation: Stay abreast of the latest developments in data science and machine learning, and apply innovative techniques to improve our data capabilities.
- Mentorship: Mentor junior data scientists and data analysts, providing guidance and support to help them grow their skills and contribute effectively to the team.
- Optimization: Continuously monitor and optimize models and algorithms to ensure they remain effective and relevant in a changing business environment.
- ML at Scale: Design and implement large-scale distributed ML systems, including parallel training/inference pipelines across millions of users and transactions.
- LLMs & RAG Pipelines: Build and deploy Retrieval-Augmented Generation pipelines using large language models with custom embedding and retrieval strategies (a minimal retrieval sketch follows this list).
- Model Fine-Tuning: Apply techniques such as LoRA, QLoRA, and PEFT for adapting foundation models to domain-specific tasks (e.g., address parsing, fraud scoring).
- Vector Databases: Integrate and optimize vector DBs like FAISS, pgvector, or Milvus for semantic search, retrieval, and matching in LLM workflows.
- MLOps Productionization: Own end-to-end deployment, monitoring, and lifecycle management of ML models using tools like SageMaker, Docker, Airflow, MLflow, or KubeFlow.
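To make the RAG and vector-database expectations above concrete, here is a minimal retrieval sketch. It assumes sentence-transformers and faiss-cpu are installed; the embedding model, example chunks, and the retrieve() helper are illustrative stand-ins, not Shiprocket's production pipeline.

# Minimal RAG retrieval sketch: embed document chunks, index them in FAISS,
# and pull the top-k matches into an LLM prompt.
# Assumes: pip install sentence-transformers faiss-cpu
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence-embedding model

chunks = [
    "Order 1234 was delivered to Bengaluru on 2025-08-01.",
    "Standard COD remittance takes two business days.",
    "Several 560001-pincode shipments failed address parsing last week.",
]

# Flat inner-product index over L2-normalised vectors, i.e. cosine similarity.
embeddings = encoder.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype="float32"))

def retrieve(query, k=2):
    q = encoder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [chunks[i] for i in ids[0]]

question = "Why did some shipments fail address parsing?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# prompt is then passed to whichever LLM backs the application.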
Skills and Qualifications:
Education:
- Bachelor's, Master's, or Ph.D. in Data Science, Computer Science, Statistics, Mathematics, or a related field.
Experience:
- Minimum of 3 years of experience in data science, with a proven track record of leading successful data-driven projects (4-8 years preferred).
Technical Skills:
- Proficiency in programming languages such as Python, Shell Scripting, and SQL.
- Strong experience with machine learning frameworks (e.g., TensorFlow, PyTorch, scikit-learn).
- Experience with big data technologies such as Spark and Hadoop is a plus.
- Experience with AWS and cloud-based ML deployment solutions (e.g., SageMaker, Batch, Lambda).
- Expertise in building and managing end-to-end ML pipelines and ETL processes.
- Experience with large language models (LLMs) and embeddings for downstream applications.
- Experience in RAG architecture: chunking, vectorization, retrieval, prompt orchestration.
- Familiarity with vector search engines like FAISS, pgvector, or Pinecone.
- Hands-on with fine-tuning techniques: LoRA, QLoRA, PEFT, quantization, and distillation (an illustrative LoRA setup follows this list).
- Understanding of model observability, drift detection, model versioning, and CI/CD for ML.
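As an illustration of the fine-tuning skills listed above, the following sketch wires a LoRA adapter onto an open checkpoint with Hugging Face PEFT; the base model, rank, and target modules are placeholder assumptions, not a prescribed configuration for any Shiprocket task.

# Illustrative LoRA fine-tuning setup with Hugging Face PEFT: only small
# low-rank adapter matrices are trained while the base weights stay frozen.
# Base model, rank, and target modules below are assumptions for this sketch.
# Assumes: pip install transformers peft torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base_id = "facebook/opt-350m"  # stand-in open checkpoint
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling applied to the adapter output
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in OPT
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base weights
# Training then proceeds with a standard Trainer/SFT loop; only the adapters update.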
Adaptability:
- Ability to work in a fast-paced, dynamic environment and manage multiple projects simultaneously.