Posted on: 15/01/2026
Role Overview :
We are seeking a highly skilled Python AI Engineer to lead the design and development of advanced agentic workflows and large-scale LLM systems.
This role sits at the intersection of AI engineering, prompt operations, evaluation pipelines, and production-grade deployment.
The ideal candidate brings deep expertise in agent-based architectures, LLM evaluation and tracing, and MCP-based integrations, and has successfully deployed LLM-powered systems at scale.
You will work closely with product and engineering teams to deliver reliable, enterprise-ready AI solutions.
Key Responsibilities :
Agentic Systems & LLM Engineering :
- Design, develop, and optimize agentic workflows using modern agent frameworks.
- Build multi-step, tool-augmented, and context-aware AI agents for complex use cases.
- Architect system integrations based on the Model Context Protocol (MCP) for modular and scalable AI solutions.
LLM Evaluation, Tracing & PromptOps :
- Develop and maintain LLM evaluation pipelines, including benchmarking, regression testing, and quality scoring.
- Implement LLM tracing, observability, and debugging workflows to improve reliability and explainability.
- Own prompt engineering and prompt operations, including versioning, testing, optimization, and lifecycle management.
Production Deployment & Scalability :
- Deploy and manage AI/ML and LLM systems in production environments with high availability and performance.
- Design scalable inference architectures optimized for latency, throughput, and cost.
- Implement best practices for MLOps, CI/CD, monitoring, and model governance.
Cross-Functional Collaboration :
- Collaborate with product managers, backend engineers, and data teams to translate business requirements into AI solutions.
- Participate in architectural discussions and contribute to AI platform and roadmap decisions.
- Take ownership of features and systems from concept to production and post-deployment support.
Key Result Areas (KRAs):
- Successful delivery of production-grade agentic AI systems
- Reliability and performance of LLM evaluation and tracing pipelines
- Scalability and stability of deployed AI solutions
- Quality and effectiveness of prompt engineering and agent behavior
- Stakeholder satisfaction and cross-team collaboration
Required Skills & Qualifications :
Core Technical Skills :
- Strong proficiency in Python with hands-on experience in AI/ML libraries and frameworks.
- Proven experience with agentic frameworks such as LangChain, LlamaIndex, or similar.
- Hands-on experience with LLM evaluation and observability tools such as LangSmith, Arize, DeepEval, or equivalents.
- Deep understanding of prompt engineering and prompt operations.
- Strong experience deploying and managing AI/ML models in production environments.
Cloud & MLOps :
- Experience with at least one major cloud platform: AWS, Azure, or GCP.
- Solid understanding of MLOps practices, CI/CD pipelines, model monitoring, and versioning.
- Familiarity with containerization and scalable deployment patterns.
Tools & Ways of Working :
- Hands-on experience using GitLab, Trello, Zoom, or similar collaboration and DevOps tools.
- Ability to work independently, take ownership, and drive problems to resolution.
- Comfortable working in fast-paced, high-pressure environments with multiple priorities.
Behavioral & Professional Competencies :
- Strong problem-solving and analytical skills.
- Excellent communication and stakeholder management abilities.
- Self-starter mindset with a strong sense of ownership and accountability.
- Keen interest in continuously learning new AI technologies, business processes, and engineering practices.
Why Join Us :
- Work on cutting-edge LLM and agentic AI systems at scale.
- High ownership role with visibility across product and engineering leadership.
- Opportunity to influence AI architecture, tooling, and best practices.
- Collaborative, innovation-driven engineering culture.