We are looking for a skilled AI/ML Engineer to help design and implement GenAI-based systems that interface with real-time enterprise data. You will be responsible for developing, fine-tuning, orchestrating, and integrating LLM-powered capabilities such as retrieval-augmented generation (RAG), function/tool calling, and data-grounded Q&A, within the Azure OpenAI ecosystem.

The ideal candidate brings hands-on experience with LLM orchestration frameworks, prompt engineering, embedding models, and integrating AI systems into production-grade Azure-based platforms.

Core Responsibilities:

LLM System Development:

- Design and implement LLM-based pipelines, including:

  - Prompt engineering

  - Few-shot and zero-shot techniques

  - Function/tool calling

  - Chain-of-thought and structured output generation

- Work with Azure OpenAI, GPT-4, and embedding models for various use cases

- Build conversational flows, decision trees, and fallback logic for copilots or assistants

Retrieval-Augmented Generation (RAG):

Develop and optimize RAG pipelines:

- Create embedding pipelines (e.g., using text-embedding-ada-002, Cohere, or Sentence Transformers)

- Chunk and index content from structured and unstructured sources (PDFs, Office files, HTML, etc.)

- Store and retrieve embeddings using Azure AI Search, FAISS, or Weaviate

- Evaluate grounding accuracy and relevance scoring

Machine Learning Models:

- Build, train, and fine-tune time series forecasting models (e.g., XGBoost, Prophet, ARIMA, or LSTM) for financial KPIs where GenAI requires predictive context

- Combine structured model outputs with LLM reasoning (e.g., forecasts + narrative insights)

Tool/Function Integration:

- Integrate structured data APIs, SQL endpoints, Power BI connectors, and OLAP cube access as tools/functions callable by the LLM

- Design input/output schemas for safe and deterministic API usage by the model

- Support plugin-style orchestration (e.g., LangChain, native function calling, or Semantic Kernel)

Evaluation & Iteration:

- Define custom evaluation frameworks using metrics like:

  - Hallucination rate

  - Grounding precision/recall

  - Prompt latency and token efficiency

- Set up experiment tracking using tools like MLflow, Weights & Biases, or PromptLayer

- Maintain few-shot and test prompt sets, and refine them continuously

Required Skills and Experience:

- 4-6+ years of experience in AI/ML/NLP engineering

- Deep familiarity with LLM systems: prompt tuning, orchestration, and fine-tuning

- Hands-on experience with:

1. Azure OpenAI Service

2. LangChain, Semantic Kernel, or similar orchestration tools

3. Vector databases (Azure AI Search, FAISS, Pinecone)

4. Embedding model APIs (OpenAI, HuggingFace, Cohere, etc.)

- Strong understanding of time series modeling and ML forecasting techniques in financial domains (e.g., cost, margin, working capital, price volatility)

- Strong proficiency in Python, with experience in developing modular, testable code for AI/ML pipelines, API integrations, and backend services

- Experience building and deploying backend components (e.g., FastAPI, Flask) to serve AI models or integrate with retrieval pipelines

- Familiarity with best practices for production-grade AI applications, including logging, monitoring, and containerization (e.g., Docker)

- Ability to work across the full stack of an AI system, from model development to integration and inference APIs

- Experience in building chatbots or copilots in enterprise settings

- Knowledge of Azure cloud services, especially Functions, App Services, Blob Storage, and Key Vault

- Familiarity with enterprise systems like Power BI, SAP, or OLAP cubes