Posted on: 08/12/2025
Description :
We are seeking an experienced and highly skilled Senior Data Scientist to join our team in Bengaluru. This role focuses on driving innovative, large-scale solutions using cutting-edge Classical Machine Learning, PySpark, Spark SQL, and Generative AI.
The ideal candidate will combine deep technical expertise, strong business acumen, effective communication skills, and a sense of ownership, and will be motivated to establish quantifiable business impact. We require a proven track record in designing, developing, and deploying scalable real-time ML/DL pipelines and LLM Agents in a fast-paced, collaborative environment.
Responsibilities :
- Efficiently handle and model billions of data points using multi-cluster data processing frameworks (PySpark, Spark SQL).
- Develop and implement scalable deployment pipelines using Docker and AWS services (ECR, Lambda, Step Functions).
- Own entire workstreams end to end: from use-case identification, through initial design and POC (building custom machine learning solutions as needed), to calculating the business impact of the use case, while ensuring a modular, scalable, production-ready codebase.
- Design and implement custom models and loss functions, and handle nuanced conversations about the trade-offs between modelling choices.
- Apply specialised modelling for marketing scenarios (Targeting, Budget optimisation, Churn) and data limitations (Sparse/incomplete labels, Single class learning).
- Build LLM-ready Data Management layers for large-scale structured and unstructured data.
- Apply foundational understanding of LLM Agents and multi-agent systems (e.g., Agent-Critique, ReAct, Agent Collaboration), advanced prompting, LLM evaluation, confidence grading, and Human-in-the-Loop systems.
- Team Mentorship and Stakeholder Management :
- Mentor, support and manage a cross-functional team.
- Bring structure to the client engagement, both internally and externally, through effective top-down communication.
- Act as the primary contact for clients, translating complex data needs into tasks.
- Present data insights to stakeholders, highlighting business impacts.
- Collaborate with cross-functional teams to align AI initiatives with business goals.
Requirements :
- Expertise in Databricks/AWS is a must-have : ability to design, write, scale, and monitor end-to-end ML pipelines on Databricks/AWS.
- Proven expertise in running and managing Databricks data pipelines in real time for low-latency decision-making.
- Proficiency in Python and its data science ecosystem (NumPy, Pandas, Dask, PySpark) for large-scale data processing.
- Expert, hands-on experience with Databricks for MLOps, pipeline orchestration, and real-time deployment.
- Ability to perform effective feature engineering by understanding complex business objectives.
- In-depth knowledge of Classical ML : Tree-Based Models, GLMs, Clustering Models, etc.
- In-depth knowledge of Deep Learning : ANNs, 1D/2D/3D Convolutional Neural Networks (ConvNets), LSTMs, Transformer models.
- Strong proficiency in PU learning, single-class learning, and representation learning, alongside traditional ML approaches.
- Advanced understanding and application of model explainability techniques (e.g., SHAP, LIME).
- Hands-on experience with ML/DL libraries such as Scikit-learn, TensorFlow/Keras, and PyTorch.
- Experience using large language models (GPT-4, Mistral, Llama, Claude) through prompt engineering and custom fine-tuning.
- Code Versioning Systems : Git, GitHub.
- Communication Skills : This is perhaps the most important soft skill for us.
- You must be able to capture the attention of your audience (usually in client calls), put your ideas across succinctly to your team members, and bring clarity of thought and clear next steps to the table, presenting them well.
- Presentation Skills : Be able to present your ideas visually on a whiteboard. Be able to build compelling, top-down presentations for CxOs with business impact in mind.
- Problem Solving Skills : Be able to use internal tools and client datasets to frame a problem in the shortest time possible. Be able to make trade-offs with the timelines in mind.
- Background in the Pharma Domain.
- Knowledge of Recommender Systems and Next Best Action Systems.