hirist

Job Description

About the Role :


We are looking for a highly skilled AI Engineer with deep expertise in Large Language Models (LLMs), multi-modal AI systems, and real-time data processing to join our advanced AI team. The ideal candidate will design, develop, and optimize cutting-edge AI applications leveraging state-of-the-art frameworks and architectures. You will work on building scalable, low-latency AI services, natural language interfaces, and multi-modal retrieval systems that push the boundaries of current AI capabilities.


Key Responsibilities :


- Build and optimize agentic patterns such as ReAct and ReWOO, and use LLMCompiler to enhance LLM capabilities (a minimal ReAct-style loop is sketched after this list).


- Design advanced prompting strategies, including Chain of Thought (CoT), LLM-as-a-judge evaluation, and self-reflection prompting, to improve model reasoning and output quality (a short reflection sketch follows this list).


- Implement prompt compression and optimization using tools like LLMLingua, AdaFlow, TextGrad, and DSPy to maximize efficiency within limited context windows.


- Manage context windows and optimize prompt designs to balance performance with computational constraints.


- Develop and maintain multi-modal AI systems that process and integrate text, images, audio, and video data.


- Design and implement chunking and clustering strategies for efficient processing of large, heterogeneous data sources (a baseline chunking routine is sketched after this list).


- Architect and develop audio/video streaming pipelines and real-time data processing systems to support low-latency inference and interactive AI applications.


- Ensure scalable and robust real-time AI service deployment with minimal latency.


- Build natural language-driven SQL query generation engines to facilitate intuitive data querying for end-users (see the NL2SQL sketch after this list).


- Optimize generated SQL queries for accuracy and performance across various database systems.


- Design and develop scalable, secure APIs for AI model serving using FastAPI or equivalent frameworks (a minimal serving endpoint is sketched after this list).


- Containerize AI services using Docker and deploy them efficiently with orchestration tools such as Kubernetes or Docker Swarm.


- Create data pipelines capable of handling large-scale document and multimedia data ingestion, chunking, preprocessing, and indexing for model training and inference.


- Implement chunking strategies and clustering algorithms to improve data retrieval and model performance.


- Utilize AI/ML frameworks such as TensorFlow and PyTorch, as well as specialized LLM frameworks such as LangChain and LangGraph, for model development.


- Stay updated with emerging LLM technologies and integrate best-in-class tools into the AI stack.
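
The sketch below illustrates the ReAct-style loop referenced in the first responsibility: the model alternates Thought / Action / Observation until it produces a final answer. The `call_llm` helper, tool names, and prompt wording are assumptions for illustration, not the API of any specific agent framework.

```python
import re

def call_llm(prompt: str) -> str:
    """Placeholder: wire this to a real model endpoint or SDK."""
    raise NotImplementedError

# Illustrative tool registry; a real agent would expose search, retrieval, code execution, etc.
TOOLS = {
    "search": lambda q: f"(stub) top result for {q!r}",
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # demo only, not safe for untrusted input
}

def react_agent(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # Ask the model for the next Thought and Action given the transcript so far.
        reply = call_llm(
            "Use the format:\nThought: ...\nAction: <tool>[<input>]  or  Final Answer: ...\n\n"
            + transcript
        )
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        # Execute the requested tool and feed the result back as an Observation.
        match = re.search(r"Action:\s*(\w+)\[(.*)\]", reply)
        if match:
            tool, arg = match.group(1), match.group(2)
            observation = TOOLS.get(tool, lambda _: "unknown tool")(arg)
            transcript += f"Observation: {observation}\n"
    return "No final answer within the step budget."
```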
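
For the prompting-strategy bullet, a compact chain-of-thought plus self-reflection pattern might look like the following; the three prompts and the `call_llm` placeholder are illustrative only.

```python
def call_llm(prompt: str) -> str:
    """Placeholder: wire this to a real model endpoint or SDK."""
    raise NotImplementedError

def answer_with_reflection(question: str) -> str:
    # First pass: chain-of-thought draft (reason step by step before answering).
    draft = call_llm(
        f"Question: {question}\nThink step by step, then give the final answer on the last line."
    )
    # Second pass: the model critiques its own draft.
    critique = call_llm(
        f"Question: {question}\nDraft:\n{draft}\nList any factual or logical errors in the draft."
    )
    # Third pass: revise the draft in light of the critique.
    return call_llm(
        f"Question: {question}\nDraft:\n{draft}\nCritique:\n{critique}\n"
        "Write an improved final answer that addresses the critique."
    )
```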
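
A baseline fixed-size chunking routine with overlap, as one possible starting point for the ingestion and retrieval bullets; sizes are counted in whitespace-separated words here rather than model tokens.

```python
def chunk_text(text: str, chunk_size: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks so context is preserved across boundaries."""
    assert chunk_size > overlap, "chunk_size must exceed overlap"
    words = text.split()
    chunks, step = [], chunk_size - overlap
    for start in range(0, len(words), step):
        chunk = " ".join(words[start:start + chunk_size])
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(words):
            break
    return chunks
```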
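
For the NL2SQL bullet, the core step is supplying the schema in the prompt and guarding what comes back before execution. The schema, table name, and `call_llm` helper below are hypothetical.

```python
import sqlite3

def call_llm(prompt: str) -> str:
    """Placeholder: wire this to a real model endpoint or SDK."""
    raise NotImplementedError

SCHEMA = "CREATE TABLE orders (id INTEGER, customer TEXT, total REAL, created_at TEXT);"

def nl_to_sql(question: str) -> str:
    sql = call_llm(
        f"Schema:\n{SCHEMA}\n"
        f"Write one SQLite SELECT statement that answers: {question}\n"
        "Return only the SQL."
    ).strip().rstrip(";")
    # Guardrail: only allow read-only queries generated by the model.
    if not sql.lower().lstrip().startswith("select"):
        raise ValueError("Refusing to run a non-SELECT statement")
    return sql

def run_query(db_path: str, question: str) -> list[tuple]:
    with sqlite3.connect(db_path) as conn:
        return conn.execute(nl_to_sql(question)).fetchall()
```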
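
Finally, a minimal FastAPI serving endpoint of the kind the API bullet describes; the route, request fields, and stubbed generation logic are illustrative assumptions, and a production service would add authentication, batching, and streaming.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="llm-service")

class GenerateRequest(BaseModel):
    prompt: str
    max_tokens: int = 256

class GenerateResponse(BaseModel):
    completion: str

@app.post("/generate", response_model=GenerateResponse)
def generate(req: GenerateRequest) -> GenerateResponse:
    # A real service would call the loaded model here; stubbed for illustration.
    return GenerateResponse(completion=f"(stub) echo of: {req.prompt[:50]}")

# Run with: uvicorn main:app --host 0.0.0.0 --port 8000
```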


Qualifications & Skills :


- Proven experience in developing, fine-tuning, and deploying Large Language Models (LLMs) and multi-modal AI systems.


- Strong expertise with LangChain, LangGraph, LLMCompiler, and other LLM-related technologies.


- Hands-on experience designing agentic AI patterns such as ReAct and ReWOO, and implementing RAG systems across multiple modalities.


- Experience with streaming technologies and real-time inference architectures for AI applications.


- Proficiency in natural language interface design and NL2SQL implementations, including query optimization.


- Skilled in building scalable AI APIs with FastAPI and containerizing applications with Docker.


- Familiarity with container orchestration platforms like Kubernetes for production deployment.


- Solid understanding of data engineering principles, especially chunking strategies and clustering for large datasets.


- Proficient in Python programming and AI/ML frameworks such as TensorFlow and PyTorch.


- Expertise in advanced prompt engineering techniques and tools for prompt optimization and context management.


- Strong analytical mindset with the ability to design efficient algorithms and architectures for AI model performance.

