hirist

LLM Engineer/AI Engineer - RAG/LangChain

YO IT CONSULTING
Hyderabad
5 - 8 Years

Posted on: 09/03/2026

Job Description

Description : LLM Engineer / AI Engineer


Location : Hyderabad, India


Working Model : Hybrid (3 Days Work from Office)


Experience : 5 to 8 Years


Education : Bachelor of Engineering / Bachelor of Technology (B.E. / B.Tech) in Computer Science, Artificial Intelligence, Data Science, or a related technical field.


Role Overview :


We are looking for an experienced LLM Engineer/AI Engineer to design, develop, and optimize cutting-edge Large Language Model (LLM) based systems that power intelligent automation and advanced AI agents. The candidate will play a key role in building scalable AI architectures capable of integrating with internal knowledge bases, customer systems, and trading platform data.


The ideal candidate should have hands-on experience working with state-of-the-art LLMs, Retrieval-Augmented Generation (RAG) architectures, prompt engineering frameworks, and agent-based AI systems. You will collaborate closely with data scientists, backend engineers, product teams, and DevOps teams to deploy robust AI-powered solutions in production environments.


This role involves working on high-performance AI infrastructure, optimizing model performance, latency, and cost efficiency, and ensuring responsible AI practices through robust evaluation and monitoring frameworks.


Key Responsibilities :


1. Model Engineering :


- Design, implement, and optimize LLM-powered pipelines using both proprietary and open-source language models.


- Build scalable AI pipelines using models such as OpenAI GPT models, Anthropic Claude, Meta Llama, Mistral, and other open-source LLMs.


- Evaluate models based on performance, cost, latency, and accuracy to determine optimal deployment strategies.


- Implement model orchestration workflows for production-grade applications.


- Fine-tune or adapt models using parameter-efficient tuning techniques (LoRA, adapters, etc.) where applicable.


- Continuously research and integrate new advancements in LLM architectures and frameworks.
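As an illustration of the performance/cost/latency trade-off described above, here is a minimal sketch of scoring candidate models to pick a deployment target. The model names and metric values are purely hypothetical, not real benchmark numbers:

```python
# Hypothetical sketch: choosing a deployment model by weighing accuracy,
# latency, and cost. All names and numbers below are illustrative assumptions.

CANDIDATES = {
    "gpt-large":  {"accuracy": 0.92, "latency_ms": 1200, "cost_per_1k_tokens": 0.030},
    "claude-mid": {"accuracy": 0.90, "latency_ms": 900,  "cost_per_1k_tokens": 0.015},
    "llama-open": {"accuracy": 0.85, "latency_ms": 400,  "cost_per_1k_tokens": 0.002},
}

def score(metrics, w_acc=1.0, w_lat=0.3, w_cost=0.2):
    """Higher is better: reward accuracy, penalise (normalised) latency and cost."""
    return (w_acc * metrics["accuracy"]
            - w_lat * metrics["latency_ms"] / 1000.0
            - w_cost * metrics["cost_per_1k_tokens"] * 10.0)

def pick_model(candidates, **weights):
    """Return the candidate with the best weighted score."""
    return max(candidates, key=lambda name: score(candidates[name], **weights))

if __name__ == "__main__":
    # Raising w_lat shifts the choice toward the faster, cheaper open model.
    print(pick_model(CANDIDATES))
    print(pick_model(CANDIDATES, w_lat=2.0))
```

In practice the weights would come from product requirements (e.g., a chat surface weighs latency heavily; a batch pipeline weighs cost).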


2. Retrieval-Augmented Generation (RAG) Architecture :


- Design and implement sophisticated RAG pipelines to ensure AI agents have access to relevant and up-to-date information.


- Develop RAG systems that integrate internal documentation, customer interaction history, and trading platform data.


- Implement vector databases and semantic search systems to support retrieval workflows.


- Optimize document chunking, embedding strategies, and retrieval ranking.


- Work with tools such as Pinecone, Weaviate, FAISS, Chroma, or similar vector databases.


- Ensure data security, compliance, and proper access controls within retrieval pipelines.
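The retrieval workflow above can be sketched end to end: chunk documents, embed them, and rank chunks by similarity to the query. This toy version uses a bag-of-words "embedding" and in-memory search purely for illustration; a production system would use a learned embedding model and a vector database such as FAISS, Pinecone, Weaviate, or Chroma. The sample documents are invented:

```python
# Hypothetical sketch of a minimal RAG retrieval step.
import math
from collections import Counter

def chunk(text, size=50):
    """Split a document into fixed-size word chunks (naive chunking strategy)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    """Return the top-k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

docs = [
    "Order settlement on the trading platform runs nightly at midnight UTC.",
    "Customers can reset their password from the account security page.",
]
chunks = [c for d in docs for c in chunk(d, size=8)]
print(retrieve("when does trade settlement run", chunks, k=1))
```

Chunk size, overlap, and the ranking function are exactly the levers the "optimize document chunking, embedding strategies, and retrieval ranking" bullet refers to.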


3. Prompt Engineering & Model Tuning :


- Develop advanced prompt strategies that maximize reasoning capability and output reliability.


- Design, test, and version-control complex prompt templates.


- Apply advanced prompting techniques where appropriate.


- Maintain prompt libraries and experiment frameworks.


- Continuously refine prompts to improve accuracy, reasoning ability, and context awareness.


- Work with frameworks such as LangChain, LlamaIndex, Semantic Kernel, or similar orchestration tools.
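One way to read the "design, test, and version-control complex prompt templates" responsibility is a prompt library that keeps every version addressable. The sketch below is a minimal in-memory illustration; template names and contents are invented, and a real system would persist versions in git or a tool such as LangSmith:

```python
# Hypothetical sketch: a version-controlled prompt library.

class PromptLibrary:
    def __init__(self):
        self._templates = {}  # name -> list of versions (index 0 is version 1)

    def register(self, name, template):
        """Add a new version of a template; returns its version number."""
        self._templates.setdefault(name, []).append(template)
        return len(self._templates[name])

    def render(self, name, version=None, **vars):
        """Render a specific version (latest by default) with variables filled in."""
        versions = self._templates[name]
        tmpl = versions[-1] if version is None else versions[version - 1]
        return tmpl.format(**vars)

lib = PromptLibrary()
lib.register("summarise", "Summarise the following text:\n{text}")
lib.register("summarise", "You are a careful analyst. Summarise, citing sources:\n{text}")
print(lib.render("summarise", text="Q3 trading volumes rose 12%."))
print(lib.render("summarise", version=1, text="Q3 trading volumes rose 12%."))
```

Keeping old versions addressable lets an experiment framework A/B-test a prompt change against the prior version before rolling it out.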


4. Evaluation & Testing :


- Build strong evaluation pipelines to ensure safe and reliable model performance.


- Develop LLM evaluation frameworks using automated metrics and human-in-the-loop feedback.


- Implement LLM-as-a-Judge systems for evaluating outputs.


- Measure key performance metrics such as accuracy, latency, cost, and hallucination rate.


- Create automated testing pipelines for pre-production validation.


- Establish benchmarking protocols for comparing models and system architectures.
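The LLM-as-a-Judge pattern mentioned above can be sketched as a small evaluation loop. `judge_model` is a stand-in callable; in production it would wrap a real LLM API call, and the 1-5 rubric and parsing logic here are illustrative assumptions:

```python
# Hypothetical sketch of an LLM-as-a-Judge evaluation loop.
import re

JUDGE_PROMPT = (
    "Rate the answer below for factual accuracy on a scale of 1-5.\n"
    "Question: {question}\nAnswer: {answer}\n"
    "Respond with only the number."
)

def judge_score(judge_model, question, answer):
    """Ask the judge model for a 1-5 score and parse the first digit it returns."""
    reply = judge_model(JUDGE_PROMPT.format(question=question, answer=answer))
    match = re.search(r"[1-5]", reply)
    if match is None:
        raise ValueError(f"unparseable judge reply: {reply!r}")
    return int(match.group())

def evaluate(judge_model, cases):
    """Average judge score over a list of (question, answer) pairs."""
    scores = [judge_score(judge_model, q, a) for q, a in cases]
    return sum(scores) / len(scores)

# Stub judge for demonstration only; a real judge would be a separate LLM.
stub_judge = lambda prompt: "Score: 4"
print(evaluate(stub_judge, [("What is RAG?", "Retrieval-Augmented Generation.")]))
```

In a real pipeline the automated judge scores would be spot-checked against human-in-the-loop labels, since judge models have their own biases.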


5. Tool-Use Logic & Agent Orchestration :


- Develop logic that enables autonomous agents to interact with external systems.


- Implement agent frameworks that allow models to use external tools and APIs.


- Ensure safe execution, error handling, and access control for agent actions.


- Implement planning and reasoning architectures for multi-step task execution.
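The "safe execution, error handling, and access control" bullet above boils down to a guarded dispatch layer between the model and external systems. The tool names and stub implementations below are invented; frameworks like LangChain layer richer schemas and planning loops on top of this pattern:

```python
# Hypothetical sketch of safe tool-use dispatch for an agent.

TOOLS = {
    "get_price": lambda symbol: {"symbol": symbol, "price": 101.5},  # stub API call
    "lookup_doc": lambda query: f"doc results for {query!r}",        # stub search
}

ALLOWED = {"get_price", "lookup_doc"}  # access control: only vetted tools may run

def execute_tool(name, *args):
    """Run a tool requested by the model, with allowlisting and error handling."""
    if name not in ALLOWED:
        return {"error": f"tool {name!r} is not permitted"}
    try:
        return {"result": TOOLS[name](*args)}
    except Exception as exc:  # never let a tool failure crash the agent loop
        return {"error": str(exc)}

print(execute_tool("get_price", "ACME"))
print(execute_tool("delete_account", "ACME"))  # blocked by the allowlist
```

Returning structured errors rather than raising keeps a multi-step agent loop alive: the model can observe the failure and re-plan instead of the whole request aborting.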


6. Latency Optimization & Performance Engineering :


- Ensure high-performance AI systems capable of responding in near real-time.


- Optimize inference pipelines to reduce response latency.


- Improve vector search speed and data retrieval performance.


- Implement caching strategies and request batching.


- Design scalable systems capable of handling high request volumes.


- Work closely with infrastructure teams to optimize GPU/CPU utilization.
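Two of the optimizations listed above, caching and request batching, combine naturally for embedding lookups. In this sketch `embed_batch` is a stub for a real embedding API that costs one round-trip per batch regardless of size; the placeholder vectors are illustrative:

```python
# Hypothetical sketch: cache repeated embedding lookups and batch cache misses.

_cache = {}
CALLS = {"batches": 0}  # instrument how many backend round-trips we make

def embed_batch(texts):
    """Stub batch embedding call; one round-trip regardless of batch size."""
    CALLS["batches"] += 1
    return {t: [float(len(t))] for t in texts}  # placeholder vectors

def embed_cached(texts):
    """Send only cache misses to the backend, in a single batched call."""
    misses = [t for t in texts if t not in _cache]
    if misses:
        _cache.update(embed_batch(misses))
    return [_cache[t] for t in texts]

vecs = embed_cached(["hello", "world", "hello"])  # "hello" is embedded only once
```

The same shape applies to LLM completions: dedupe identical requests, batch the rest, and the backend sees far fewer calls at high request volumes.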


7. DevOps for AI (LLMOps) :


- Support deployment, monitoring, and maintenance of AI systems in production environments.


- Implement LLMOps pipelines for continuous monitoring of model behavior.


- Track metrics such as model drift, hallucination trends, latency, and token usage.


- Set up logging, observability, and alerting systems.


- Collaborate with DevOps teams to ensure secure and scalable AI deployments.


- Use tools such as MLflow, Weights & Biases, LangSmith, Arize AI, or similar platforms.
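The monitoring responsibilities above, tracking latency, token usage, and hallucination trends with alerting, can be sketched as a small rolling-window tracker. The window size and alert threshold are assumptions, and in production the recorded metrics would ship to a platform like MLflow, LangSmith, or Arize rather than stay in memory:

```python
# Hypothetical sketch of lightweight LLMOps monitoring with a drift alert.
from collections import deque

class Monitor:
    def __init__(self, window=100, hallucination_threshold=0.1):
        self.flags = deque(maxlen=window)  # rolling hallucination flags
        self.threshold = hallucination_threshold
        self.latencies = []
        self.total_tokens = 0

    def record(self, latency_ms, tokens, hallucinated):
        """Log one request; in production this would also ship to an observability tool."""
        self.flags.append(hallucinated)
        self.latencies.append(latency_ms)
        self.total_tokens += tokens

    def hallucination_rate(self):
        """Fraction of recent requests flagged as hallucinations."""
        return sum(self.flags) / len(self.flags) if self.flags else 0.0

    def should_alert(self):
        """Fire when the rolling rate drifts above the configured threshold."""
        return self.hallucination_rate() > self.threshold

mon = Monitor(window=10, hallucination_threshold=0.2)
for flagged in [False] * 7 + [True] * 3:
    mon.record(latency_ms=850, tokens=120, hallucinated=flagged)
print(mon.hallucination_rate(), mon.should_alert())
```

The rolling window is what turns point failures into a trend signal, so alerts reflect drift rather than a single bad response.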


Required Skills & Technical Expertise :


- Strong experience with Python programming


- Experience with LLM frameworks such as LangChain, LlamaIndex, or Semantic Kernel


- Experience with vector databases and embeddings


- Hands-on knowledge of RAG architectures


- Understanding of transformer models and NLP concepts


- Experience working with REST APIs, microservices, and backend systems


- Familiarity with cloud platforms (AWS, Azure, or GCP)


- Experience with Docker, Kubernetes, and scalable infrastructure

