Carelon - Senior Data Scientist - NLP/LLM

Carelon
5 - 10 Years
Multiple Locations

Posted on: 28/04/2026

Job Description

JOB POSITION :


We are currently looking to hire a Senior Data Scientist with strong analytical skills and a background in US Healthcare. The ideal candidate should have :


- A minimum of 5 years of overall experience in data science or related fields


- At least 3 years of hands-on experience in Machine Learning (ML) and Natural Language Processing (NLP)


- Proven expertise in healthcare data analytics and a solid understanding of US healthcare systems (preferred)


JOB RESPONSIBILITIES :


Key Responsibilities :


- Demonstrate expertise in programming with a strong background in machine learning and data processing.


- Possess strong analytical skills to interpret complex healthcare datasets and derive actionable insights.


- Collaborate closely with AI/ML engineers, data scientists, and product teams to acquire and process data, debug issues, and enhance ML models.


- Develop and maintain enterprise-grade data pipelines to support state-of-the-art AI/ML models.


- Work with diverse data types including structured, semi-structured, and textual data.


- Communicate effectively and collaborate with cross-functional teams including engineering, product, and customer stakeholders.


- Operate independently with minimal guidance from product managers and architects, demonstrating strong decision-making capabilities.


- Embrace complex problems and deliver intelligence-driven solutions with a focus on innovation and scalability.


- Quickly understand product requirements and adapt to evolving business needs and technical environments.


Technical Responsibilities :


- Design and implement statistical and machine learning models (e.g., regression, classification, clustering) using frameworks such as scikit-learn, TensorFlow, and PyTorch.


- Build robust data preprocessing pipelines to handle missing values, outliers, feature scaling, and dimensionality reduction.


- Specialize in Large Language Model (LLM) development, including fine-tuning, prompt engineering, and embedding optimization using frameworks like Hugging Face Transformers.


- Develop and optimize LLM evaluation frameworks using metrics such as ROUGE, BLEU, and custom human-aligned evaluation techniques.


- Apply advanced statistical methods including hypothesis testing, confidence intervals, and experimental design to extract insights from complex datasets.


- Create NLP solutions for text classification, sentiment analysis, and topic modeling using both classical and deep learning approaches.


- Design and execute A/B testing strategies, including sample size determination, metric selection, and statistical analysis (e.g., t-tests, ANOVA).


- Implement comprehensive data visualization strategies using tools like Matplotlib, Seaborn, and Plotly to present insights effectively.


- Maintain detailed documentation of model architectures, experiments, and validation results using tools like MLflow or DVC.


- Research and apply LLM optimization techniques such as quantization, pruning, and knowledge distillation to improve efficiency.


- Stay up to date with the latest advancements in statistical learning, deep learning, and LLM research, with a focus on emerging architectures and training methodologies.


QUALIFICATION :


- Bachelor's or Master's degree in Computer Science, Mathematics, Statistics, Computational Linguistics, Engineering, or a related field. Ph.D. preferred.


EXPERIENCE :


- 5+ years of overall professional experience in data science, analytics, or related fields.


- 3+ years of hands-on experience working with large-scale structured and unstructured data to develop data-driven insights and solutions using Machine Learning (ML), Natural Language Processing (NLP), and Computer Vision.


- 3+ years of proven experience with core technologies including Python (mandatory), SQL, Hugging Face, TensorFlow, Keras, PyTorch, and Apache Spark.


- 3+ years of experience in developing NLP models, with a strong focus on transformer-based architectures.


- 2+ years of experience implementing information retrieval systems at scale, including both keyword-based and semantic search using embeddings.


- Hands-on experience with cloud platforms such as Google Cloud Platform (GCP) and Amazon Web Services (AWS).


- Strong expertise in Large Language Models (LLMs) and Generative AI (GAI), including model development, fine-tuning, and optimization.


- Demonstrated ability to work independently with minimal supervision and exercise sound judgment in technical and business decision-making.


- In-depth experience with LLMs (both extractive and generative), including prompt engineering, fine-tuning, and familiarity with open-source ecosystems.


- Experience in prompt development and optimization for NLP applications.


- Strategic thinker with a blend of technical expertise and business acumen, capable of solving complex problems and influencing outcomes.


- Proficient in creating analytical reports, projections, models, and presentations to support business objectives.


- Excellent written and verbal communication skills, with strong stakeholder management capabilities.


- Prior experience in the healthcare industry, with an understanding of domain-specific data and regulatory considerations.


SKILLS AND COMPETENCIES :


- Must have : Machine Learning, LLM, NLP, Python, SQL, Hugging Face, TensorFlow & Keras.


- Good to have : PyTorch, Spark & experience with any cloud platform.
