
Data Scientist - Python/Machine Learning

Aliqan Services Private Limited
Remote
7 - 10 Years

Posted on: 08/07/2025

Job Description

Job Title : Data Scientist

Location : Remote

Experience : 7 years

Mode : 6-month contract + extension

Key Responsibilities :

Data Ingestion & Transformation :

- Develop scalable ETL/ELT pipelines using Azure Data Lake, Cosmos DB, and related tools.


- Ensure quality, consistency, and governance across ingested datasets.

Embeddings & Vector Search :

- Generate, store, and manage vector embeddings using tools such as Azure AI Search, FAISS, and LangChain.

- Implement semantic search workflows and fine-tune similarity search performance.

RAG Pipelines :

- Design and enable Retrieval-Augmented Generation (RAG) pipelines to enhance LLM outputs with domain-specific knowledge.

- Integrate LLMs with structured and unstructured data sources.

Document Chunking & Metadata Tagging :

- Develop intelligent chunking strategies for unstructured content (PDFs, docs, web data, etc.).

- Enrich documents with metadata to enhance retrieval quality and LLM grounding.

Knowledge Base Integration :

- Integrate external and internal knowledge bases to support real-time retrieval and inference.

- Optimize query strategies for performance and relevance.

Performance Monitoring & Optimization :

- Measure and improve performance of embeddings, search latency, and LLM output quality.

- Collaborate with MLOps and DevOps to ensure scalable deployment.

Required Skills & Experience :


- 7 years of experience in data science, machine learning, or AI systems.


- Strong proficiency in Python, SQL, and libraries such as LangChain, FAISS, Hugging Face, etc.

- Hands-on experience with Azure Data Lake, Cosmos DB, and Azure AI Search.

- Familiarity with LLMs and building RAG pipelines in production.

- Solid understanding of vector databases, semantic similarity, and document embeddings.

- Experience with document processing : chunking, embedding generation, metadata tagging.

- Strong analytical and problem-solving skills with attention to system performance and latency.

- Excellent communication skills and ability to work in cross-functional teams.

