Senior Data Scientist - Artificial Intelligence/Machine Learning

Recruiting Bond
4 - 7 Years
Bangalore

Posted on: 10/04/2026

Job Description

Description :

Senior Data Scientist - AI/ML/NLP

- 4-7 years

- Production Model Development, Feature Engineering, Applied NLP, Experimentation Rigour

The Role :

Senior Data Scientists build models that users never see but always feel: the search result that surfaced the right hotel, the price alert that fired at exactly the right moment, the voice assistant that understood 'Bengaluru to Chennai kal ke liye' ('Bengaluru to Chennai for tomorrow') and returned the correct train options. These are the outputs of the work you will do here.

- Own 1-2 ML models from inception through production; you are accountable for their health

- Write production-quality Python: not notebooks, but tested, versioned, deployable ML code

- Your evaluation is airtight: offline and online metrics are aligned, not just correlated

- You leave documentation and monitoring that makes your models maintainable, not just functional

Core Responsibilities :

ML Model Development :

- Build supervised and unsupervised ML models for classification, ranking, regression, and recommendation using the best-fit algorithm for the problem

- Implement feature engineering pipelines: batch (Spark/dbt) and near-real-time (Kafka Streams) with data quality checks, schema validation, and lineage documentation

- Train models with rigorous evaluation: stratified splits, time-based validation for temporal data, calibration analysis for probability outputs

- Deploy models to staging and production: shadow mode testing, gradual rollout, latency profiling, and memory footprint analysis
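
To make the evaluation bullet concrete, here is a minimal sketch of a time-based split and a simple calibration check (expected calibration error over equal-width bins). The function names, the `ts_key` field, and the 10-bin default are illustrative assumptions, not part of any internal pipeline described in this posting.

```python
from typing import Any, Dict, List, Sequence, Tuple


def time_based_split(
    rows: Sequence[Dict[str, Any]], ts_key: str, cutoff: Any
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Train on rows strictly before the cutoff, validate on the rest.

    Keeping validation data later in time than training data avoids
    leaking future information into the model.
    """
    train = [r for r in rows if r[ts_key] < cutoff]
    valid = [r for r in rows if r[ts_key] >= cutoff]
    return train, valid


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between predicted probability and observed rate."""
    total = len(probs)
    if total == 0:
        return 0.0
    # Group (probability, label) pairs into equal-width probability bins.
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_p = sum(p for p, _ in bucket) / len(bucket)
        frac_pos = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_p - frac_pos)
    return ece
```

A well-calibrated model keeps this value close to zero, which matters when probability outputs drive thresholds such as price alert triggers.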

AI-First Contributions :

- Ranking: implement and evaluate L2R baseline models; contribute features and evaluation metrics to the central ranking pipeline

- Recommendation: build collaborative filtering models for a specific vertical; own the offline evaluation pipeline and cold-start testing

- NLP: train text classifiers and entity extractors for query understanding; evaluate multilingual performance on Hinglish test sets

- LLM support: contribute to RAG pipeline evaluation: build test datasets, measure retrieval quality (MRR, Recall@K), report hallucination rates

- Price ML: build fare prediction baseline models; produce calibrated probability outputs for price alert triggers

- Sentiment: train review classification models; build review summarisation evaluation benchmarks for your vertical

- Fraud ML: build transaction anomaly features; evaluate model precision/recall trade-offs for fraud scoring in booking flows
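
For the retrieval metrics named in the LLM support bullet, a minimal sketch of MRR and Recall@K might look like the following. The document IDs and function names are hypothetical; a real evaluation would read ranked results and relevance labels from the test datasets you build.

```python
from typing import List, Sequence, Set, Tuple


def reciprocal_rank(ranked_ids: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranked_ids, relevant_ids) query pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def recall_at_k(ranked_ids: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)


# Two hypothetical queries: first relevant hit at rank 2, then at rank 1.
runs = [
    (["a", "b", "c"], {"b"}),
    (["x", "y", "z"], {"x", "z"}),
]
mrr = mean_reciprocal_rank(runs)                          # (0.5 + 1.0) / 2
recall = recall_at_k(["x", "y", "z"], {"x", "z"}, k=2)    # 1 of 2 relevant docs
```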

Experimentation & Evaluation :

- Run A/B experiments: design with power analysis, monitor for SRM, apply CUPED, and write clear result summaries

- Build and maintain model monitoring dashboards: performance drift, feature distribution shift, business metric correlation

- Document your models: training procedure, feature definitions, known failure modes, and retraining schedule
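
As an illustration of two of the experiment checks above, the sketch below tests for sample ratio mismatch (SRM) in a two-arm experiment and applies a CUPED adjustment using a pre-experiment covariate. It uses only the standard library; the function names, the 50/50 expected split, and the choice of covariate are assumptions made for the example.

```python
import math
from typing import List, Sequence


def srm_p_value(n_control: int, n_treatment: int, expected_ratio: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for a two-arm split.

    expected_ratio is the planned share of traffic in control.
    A very small p-value suggests the assignment is broken.
    """
    total = n_control + n_treatment
    exp_control = total * expected_ratio
    exp_treatment = total * (1.0 - expected_ratio)
    chi2 = (
        (n_control - exp_control) ** 2 / exp_control
        + (n_treatment - exp_treatment) ** 2 / exp_treatment
    )
    # Survival function of chi-square with 1 dof: erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def cuped_adjust(metric: Sequence[float], covariate: Sequence[float]) -> List[float]:
    """Return metric - theta * (covariate - mean(covariate)).

    theta = cov(metric, covariate) / var(covariate). The adjustment keeps the
    mean of the metric unchanged while reducing its variance when the
    covariate is correlated with it.
    """
    n = len(metric)
    mean_x = sum(covariate) / n
    mean_y = sum(metric) / n
    var_x = sum((x - mean_x) ** 2 for x in covariate)
    if var_x == 0:
        return list(metric)  # constant covariate: nothing to adjust
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, metric))
    theta = cov_xy / var_x
    return [y - theta * (x - mean_x) for x, y in zip(covariate, metric)]
```

In practice theta is usually estimated on data pooled across both arms so that the same adjustment applies to control and treatment.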

What 'Senior' Actually Means Here :

- You design before you code: an ADR or sequence diagram, written before the first PR

- Your models have monitoring from day one: you define what 'degraded' looks like before you deploy

- You handle the edge cases: what happens when a feature is missing? when a model returns NaN? when the upstream API is down?

- Your code review comments are educational: you explain the why, not just the what

- You raise a concern about a flawed experiment design before the experiment runs, not after it finishes
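
One way to picture the edge-case bullet is a scoring wrapper that falls back to a safe default and records why. This is a sketch under assumptions: `safe_score`, the status strings, and the fallback value are hypothetical, and the right fallback depends on the product surface.

```python
import math
from typing import Any, Callable, Dict, Sequence, Tuple


def safe_score(
    model_fn: Callable[[Dict[str, Any]], float],
    features: Dict[str, Any],
    required: Sequence[str],
    fallback: float,
) -> Tuple[float, str]:
    """Score with a model, returning (score, status) and never raising.

    Falls back when a required feature is missing, when the model call
    fails (for example, an upstream dependency is down), or when the
    model returns NaN or infinity.
    """
    missing = [name for name in required if features.get(name) is None]
    if missing:
        return fallback, "missing_features"
    try:
        score = model_fn(features)
    except Exception:
        return fallback, "model_error"
    if score is None or math.isnan(score) or math.isinf(score):
        return fallback, "non_finite_score"
    return score, "ok"
```

Returning the status alongside the score makes fallback rates easy to count in monitoring.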

The AI-First Mandate :

AI is not an enhancement; it is the product architecture. Every surface, every API, every decision point is either ML-powered today or on the roadmap to be.

- Search & Ranking: Learning-to-Rank across flights, hotels, bus routes, train coaches; real-time re-ranking on user signals

- Voice AI: Hindi/Hinglish voice booking, intent resolution, spoken fare comparisons, accessibility-first conversational UX

- RAG Systems: Fare rule retrieval, hotel cancellation policy Q&A, airline contract intelligence, real-time regulatory updates

- Agentic AI: Autonomous booking resolution, exception handling, refund orchestration, supplier communication bots

- MCP Orchestration: Model Context Protocol tool chains across GDS APIs, payment gateways, and supplier integrations

- Recommendation Engine: Cross-vertical next-best-action, collaborative filtering, session-based deep learning

- Price Intelligence: Competitive fare mapping, lower-price guarantee engine, demand elasticity, yield optimisation

- Coupon & Promo ML: Personalised offer targeting, redemption probability scoring, margin-aware discount optimisation

- Sentiment & Review AI: Review summarisation, NPS prediction, complaint triage, trust signal extraction

- Fraud & Risk ML: Anomaly detection, account takeover signals, payment fraud scoring, fake review classification

- Deep System Mapping: Route intelligence, geo-semantic matching, multi-modal journey planning

- Predictive Systems: Cancellation risk, no-show prediction, seat upgrade probability, waitlist conversion

Who You Are :

- 4-7 years in Data Science/ML with at least 1 production model shipped, monitored, and iterated on

- Strong Python and SQL; comfortable with pandas, Scikit-learn, PyTorch, and at least one NLP library

- Working NLP knowledge: text classification, embeddings, or transformer fine-tuning; LLM curiosity is expected

- Statistical instinct: you do not celebrate p=0.049 without checking for SRM, multiple comparisons, and practical significance

- Degree in CS, Statistics, Mathematics, or a quantitative engineering field

Technology Stack :


ML : Scikit-learn, XGBoost, LightGBM, PyTorch, SHAP

NLP : HuggingFace Transformers, spaCy, Sentence Transformers, LangChain (basics)

Data : Python, Pandas, SQL, Spark (basics), Kafka (awareness)

Platform : MLflow, Airflow, Feast (basics), Docker, FastAPI

