
Job Description

About the Role :

We are looking for a highly motivated GenAI RAG (Retrieval-Augmented Generation) Engineer to join our AI team and design, develop, and optimize intelligent applications powered by large language models (LLMs) and context-aware retrieval systems. In this role, you'll be responsible for implementing state-of-the-art RAG pipelines that combine LLMs with dynamic data retrieval, enabling real-time, accurate, and contextually relevant outputs.

Key Responsibilities :

- Design and implement RAG pipelines that enhance LLM performance with external knowledge bases or proprietary datasets (a minimal end-to-end sketch follows this list).

- Integrate vector databases (e.g., FAISS, Pinecone, Weaviate, ChromaDB) with LLMs for efficient and relevant document retrieval.

- Fine-tune or prompt-engineer LLMs (e.g., OpenAI GPT models, Claude, Mistral, LLaMA) to work in tandem with retrieval mechanisms.

- Build ETL and data preprocessing workflows to convert structured/unstructured data into embeddings and indexed formats.

- Optimize latency, relevance, and accuracy of the RAG systems for different use cases (e.g., customer support, enterprise search, knowledge management).

- Work with APIs and tools like LangChain, LlamaIndex, Haystack, or custom frameworks to orchestrate RAG workflows.

- Collaborate cross-functionally with data scientists, ML engineers, and product teams to deliver scalable GenAI solutions.

- Continuously evaluate and improve the system's performance using metrics such as recall, precision, and latency.
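
To make these responsibilities concrete, below is a minimal, illustrative RAG pipeline in Python. It assumes the sentence-transformers and faiss packages are available; the example documents and the answer_with_llm helper are hypothetical placeholders rather than a prescribed stack, and a production system would swap in a managed vector database and a real LLM SDK.

```python
# Minimal RAG sketch: embed documents, index them, retrieve by similarity,
# and ground the LLM prompt in the retrieved context.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available 24/7 via chat and email.",
    "Enterprise plans include a dedicated account manager.",
]

# 1. Embed the knowledge base and build a vector index.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = encoder.encode(documents, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_embeddings.shape[1])  # inner product ~ cosine on normalized vectors
index.add(np.asarray(doc_embeddings, dtype="float32"))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Embed the query and return the top-k most similar documents."""
    q = encoder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [documents[i] for i in ids[0]]

def answer(query: str) -> str:
    """Ground the LLM prompt in retrieved context before generation."""
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return answer_with_llm(prompt)  # hypothetical LLM call; swap in your provider's SDK
```

The same structure carries over to managed stores such as Pinecone or Weaviate and to orchestration layers such as LangChain or LlamaIndex; only the indexing and generation calls change.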

Required Skills and Qualifications :

- Bachelor's or Master's degree in Computer Science, Data Science, AI/ML, or a related field.

- 2+ years of experience in AI/ML or NLP, with a strong focus on LLM-based systems.

- Proficiency in Python and popular ML/AI libraries (Hugging Face Transformers, PyTorch/TensorFlow).

- Experience with vector search and embedding generation.

- Familiarity with retrieval methods (BM25, Dense Passage Retrieval, hybrid models); a hybrid retrieval sketch follows this list.

- Strong understanding of LLMs and prompt engineering, including temperature, token limits, context windows, and fine-tuning.

- Experience working with cloud services (AWS, GCP, Azure) and deploying GenAI solutions in production environments.

- Strong problem-solving skills and ability to work in a fast-paced, research-oriented team.
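
As one illustration of the hybrid retrieval mentioned above, the sketch below blends lexical BM25 scores with dense cosine similarity. It assumes the rank_bm25 and sentence-transformers packages; the example corpus, the 0.5 blend weight, and the hybrid_search helper are illustrative assumptions, not a recommended configuration.

```python
# Hybrid retrieval sketch: min-max normalize BM25 and dense scores,
# then combine them with a tunable weight before ranking.
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

corpus = [
    "Reset your password from the account settings page.",
    "Invoices are emailed on the first business day of each month.",
    "Two-factor authentication can be enabled under security options.",
]

bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
encoder = SentenceTransformer("all-MiniLM-L6-v2")
corpus_emb = encoder.encode(corpus, normalize_embeddings=True)

def hybrid_search(query: str, alpha: float = 0.5, k: int = 2) -> list[str]:
    """Blend normalized BM25 and dense similarity scores, return top-k docs."""
    sparse = np.array(bm25.get_scores(query.lower().split()))
    dense = corpus_emb @ encoder.encode([query], normalize_embeddings=True)[0]

    def norm(x):
        # Min-max normalize so the two score scales are comparable.
        return (x - x.min()) / (x.max() - x.min() + 1e-9)

    combined = alpha * norm(sparse) + (1 - alpha) * norm(dense)
    top = np.argsort(combined)[::-1][:k]
    return [corpus[i] for i in top]

print(hybrid_search("how do I change my password"))
```

Weighted score fusion is only one option; reciprocal rank fusion or a cross-encoder re-ranker are common alternatives depending on latency and relevance requirements.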

Preferred Qualifications :

- Experience with open-source RAG frameworks (e.g., LangChain, LlamaIndex, Haystack).

- Exposure to multi-modal RAG (e.g., text + images, PDFs, tables).

- Knowledge of evaluation techniques specific to generative retrieval-based systems.

- Contributions to open-source GenAI projects or published work in NLP/RAG/LLM domains.

