Posted on: 06/08/2025
About the Role :
We are looking for a highly skilled AI Engineer with deep expertise in Large Language Models (LLMs), multi-modal AI systems, and real-time data processing to join our advanced AI team. The ideal candidate will design, develop, and optimize cutting-edge AI applications leveraging state-of-the-art frameworks and architectures. You will work on building scalable, low-latency AI services, natural language interfaces, and multi-modal retrieval systems that push the boundaries of current AI capabilities.
Key Responsibilities :
- Build and optimize agentic patterns such as ReAct and ReWOO, and use LLMCompiler to enhance LLM capabilities.
- Design advanced prompting strategies including Chain of Thought (CoT), LLM Judge, and self-reflection prompting to improve model reasoning and output quality.
- Implement prompt compression and optimization using tools like LLMLingua, AdalFlow, TextGrad, and DSPy to maximize efficiency within limited context windows.
- Manage context windows and optimize prompt designs to balance performance with computational constraints.
- Develop and maintain multi-modal AI systems that process and integrate text, images, audio, and video data.
- Design and implement chunking and clustering strategies for efficient processing of large, heterogeneous data sources.
- Architect and develop audio/video streaming pipelines and real-time data processing systems to support low-latency inference and interactive AI applications.
- Ensure scalable and robust real-time AI service deployment with minimal latency.
- Build natural language-driven SQL query generation engines to facilitate intuitive data querying for end-users.
- Optimize generated SQL queries for accuracy and performance across various database systems.
- Design and develop scalable, secure APIs for AI model serving using FastAPI or equivalent frameworks.
- Containerize AI services using Docker and deploy them efficiently with orchestration tools such as Kubernetes or Docker Swarm.
- Create data pipelines capable of handling large-scale document and multimedia data ingestion, chunking, preprocessing, and indexing for model training and inference.
- Implement chunking strategies and clustering algorithms to improve data retrieval and model performance.
- Utilize AI/ML frameworks including TensorFlow, PyTorch, and specialized LLM frameworks such as LangChain and LangGraph for model development.
- Stay updated with emerging LLM technologies and integrate best-in-class tools into the AI stack.
Qualifications & Skills :
- Proven experience in developing, fine-tuning, and deploying Large Language Models (LLMs) and multi-modal AI systems.
- Strong expertise with LangChain, LangGraph, LLMCompiler, and other LLM-related technologies.
- Hands-on experience designing agentic AI patterns like ReAct and ReWOO, and implementing RAG systems across multiple modalities.
- Experience with streaming technologies and real-time inference architectures for AI applications.
- Proficiency in natural language interface design and NL2SQL implementations, including query optimization.
- Skilled in building scalable AI APIs with FastAPI and containerizing applications with Docker.
- Familiarity with container orchestration platforms like Kubernetes for production deployment.
- Solid understanding of data engineering principles, especially chunking strategies and clustering for large datasets.
- Proficient in Python programming and AI/ML frameworks such as TensorFlow and PyTorch.
- Expertise in advanced prompt engineering techniques and tools for prompt optimization and context management.
- Strong analytical mindset with the ability to design efficient algorithms and architectures that improve AI model performance.