
Aligned Automation - Data Scientist - Machine Learning/Statistical Modeling

Posted on: 15/08/2025

Job Description

Role Overview :

As a Data Scientist, you will be at the forefront of designing, implementing, and optimizing Next Best Action (NBA) strategies using machine learning and state-of-the-art NLP techniques.

You will work closely with cross-functional teams including data engineers, domain SMEs, and business stakeholders to deliver actionable, data-driven insights and intelligent automation solutions.

Key Responsibilities :

- Build, deploy, and maintain machine learning models for predictive analytics, classification, clustering, and recommendation systems.

- Perform exploratory data analysis (EDA) on structured and unstructured datasets to uncover trends and behavioral patterns.

- Design and manage A/B testing frameworks to evaluate model performance and business impact.

- Develop pipelines for text vectorization and embeddings using Word2Vec, BERT, SBERT, or other transformer-based models.

- Implement Retrieval-Augmented Generation (RAG) workflows by integrating internal and external knowledge sources to enhance AI recommendations.

- Use frameworks like LangChain, Haystack, or custom code to build intelligent chatbot/assistant pipelines.

- Collaborate with data engineers to build data pipelines and ETL processes for model training and inference.

- Work with business teams to define Next Best Action strategies driven by ML/NLP outcomes.

- Present results and data-driven recommendations to technical and non-technical stakeholders.

- Continuously monitor and improve model performance through retraining, feedback loops, and feature engineering.

Required Skills & Experience :

- Strong programming skills in Python and libraries such as pandas, NumPy, scikit-learn, and seaborn.

- Experience with machine learning and statistical modeling techniques such as linear regression, random forests, and XGBoost.

- Practical experience with NLP tasks such as text classification, Named Entity Recognition (NER), and topic modeling.

- Hands-on experience with embedding models like Word2Vec, BERT, SBERT, and transformer architectures.

- Knowledge of prompt engineering and working with Large Language Models (LLMs) from providers such as OpenAI, Cohere, or similar.

- Proficiency in RAG (Retrieval-Augmented Generation) pipeline design and tools like LangChain, Haystack, or similar frameworks.

- Experience working with vector databases such as FAISS, Pinecone, Weaviate, or PostgreSQL with pgvector.

- Understanding of ETL/ELT pipelines, data transformation, and data cleaning techniques.

- Familiarity with SQL and data modeling.

- Exposure to cloud platforms such as AWS, GCP, or Azure.

