Posted on: 09/01/2026
Description:
We are seeking an experienced and highly skilled Senior Data Scientist to join our team in Bengaluru. This role focuses on driving innovative, large-scale solutions using cutting-edge Classical Machine Learning, PySpark, Spark SQL, and Generative AI. The ideal candidate will possess a blend of deep technical expertise, strong business acumen, effective communication skills, and a sense of ownership. We require a proven track record of designing, developing, and deploying scalable ML/DL pipelines and LLM Agents in real time, in a fast-paced, collaborative environment.
Responsibilities:
- Efficiently handle and model billions of data points using multi-cluster data processing frameworks (PySpark, Spark SQL).
- Databricks expertise is a must-have: the ability to design, write, scale, and monitor end-to-end ML pipelines on Databricks.
- Proven ability to run and manage Databricks data pipelines in real time for low-latency decision-making.
- Design and implement high-performance APIs using FastAPI in Python to expose real-time and batch ML pipelines.
- Design, implement, and deploy end-to-end ML/DL, GenAI solutions, writing modular, scalable, and production-ready code.
- Develop and implement scalable deployment pipelines using Docker and AWS services (ECR, Lambda, Step Functions).
- Design and implement custom models and loss functions to address data nuances and specific labelling challenges.
- Apply specialised modelling for marketing scenarios (Targeting, Budget optimisation, Churn) and data limitations (Sparse/incomplete labels, Single class learning).
- Leverage in-depth understanding of Transformer architectures and the principles of Large and Small Language Models.
- Practical experience in building LLM-ready Data Management layers for large-scale structured and unstructured data.
- Apply foundational understanding of LLM Agents and multi-agent systems (e.g., Agent-Critique, ReACT, Agent Collaboration), advanced prompting, LLM evaluation, confidence grading, and Human-in-the-Loop systems.
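To make the custom-loss responsibility above concrete: one common remedy for sparse or under-reported positive labels is to up-weight the positive term of the binary cross-entropy. The sketch below is a minimal, hypothetical illustration in plain NumPy; the function name and default weight are illustrative assumptions, not part of this role's codebase.

```python
import numpy as np

def weighted_bce(y_true, y_prob, pos_weight=5.0, eps=1e-7):
    """Binary cross-entropy with a weight on the positive term.

    Illustrative sketch: when positive labels are sparse, pos_weight > 1
    penalises missed positives more heavily than false alarms.
    """
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    loss = -(pos_weight * y_true * np.log(y_prob)
             + (1.0 - y_true) * np.log(1.0 - y_prob))
    return loss.mean()
```

In a framework such as PyTorch or Keras the same idea is expressed through the loss's built-in class-weighting options; the NumPy version is only meant to show the mechanics.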
Requirements:
- Proficiency in Python and its data science ecosystem (NumPy, Pandas, Dask, PySpark) for large-scale data processing.
- Expert, hands-on experience with Databricks for MLOps, pipeline orchestration, and real-time deployment.
- Ability to perform effective feature engineering by understanding complex business objectives.
- In-depth knowledge of ANN, 1D/2D/3D Convolutional Neural Networks (ConvNets), LSTMs, and Transformer models.
- Strong proficiency in PU learning, single-class learning, and representation learning, alongside traditional ML approaches.
- Advanced understanding and application of model explainability techniques (e.g., SHAP, LIME).
- Hands-on experience with ML/DL libraries such as Scikit-learn, TensorFlow/Keras, and PyTorch.
- Experience utilising large-scale language models (GPT-4, Mistral, Llama, Claude) through prompt engineering and custom fine-tuning.
- Awareness of best software design practices and backend frameworks like Flask.
- Knowledge of Recommender Systems and advanced learning techniques (representation learning, PU learning).
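To illustrate the PU-learning requirement above: a standard approach is the Elkan–Noto correction, which trains a classifier to separate labelled positives from unlabelled points, estimates the label frequency c = p(s=1 | y=1) on held-out positives, and rescales scores. The sketch below uses scikit-learn; the function name and data-split choices are illustrative assumptions, not a prescribed implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pu_correct_probabilities(X_pos, X_unlabeled, held_out_frac=0.2, seed=0):
    """Illustrative Elkan-Noto PU correction.

    Train on labelled positives (s=1) vs. unlabelled (s=0), estimate
    c = p(s=1 | y=1) on held-out positives, then rescale so that
    p(y=1 | x) ~= p(s=1 | x) / c for the unlabelled points.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X_pos))
    n_hold = max(1, int(held_out_frac * len(X_pos)))
    hold, train_pos = X_pos[idx[:n_hold]], X_pos[idx[n_hold:]]

    X = np.vstack([train_pos, X_unlabeled])
    s = np.concatenate([np.ones(len(train_pos)), np.zeros(len(X_unlabeled))])
    clf = LogisticRegression(max_iter=1000).fit(X, s)

    c = clf.predict_proba(hold)[:, 1].mean()        # label-frequency estimate
    p_y = clf.predict_proba(X_unlabeled)[:, 1] / c  # corrected positive prob.
    return np.clip(p_y, 0.0, 1.0)
```

On synthetic data where half the unlabelled set is drawn from the positive cluster, the corrected scores for those points should sit well above the scores for the true negatives.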