
Artificial Intelligence Engineer - LLM/RAG

Dash Hire
Bangalore
3 - 7 Years

Posted on: 28/11/2025

Job Description

Job Overview:

We are seeking a highly skilled AI Engineer with extensive experience in backend systems and hands-on work with Large Language Models (LLMs), AI agents, and GenAI architectures. In this role, you will design, build, and deploy intelligent systems powered by state-of-the-art models and frameworks. You will be responsible for architecting scalable AI pipelines, integrating foundation model APIs, managing vector databases, and orchestrating complex multi-agent workflows.

This position requires deep technical expertise, strong system design skills, and the ability to work cross-functionally to transform high-level ideas into production-grade AI capabilities.

Key Responsibilities:

- Design and implement end-to-end AI systems integrating LLMs, embeddings, vector search, and reasoning components.

- Build scalable pipelines for retrieval-augmented generation (RAG), memory management, contextual reasoning, and long-term agent workflows (a minimal RAG sketch follows this list).

- Architect microservice-based and event-driven AI services using asynchronous Python, queues, and streaming systems.

- Develop, fine-tune, orchestrate, and optimize LLM-powered agents, including tool-using agents, multi-step reasoning agents, and autonomous task agents.

- Implement multi-agent coordination patterns (e.g., MCP, planner-executor, supervisor-worker, or graph-based orchestration).

- Conduct prompt engineering, model evaluation, and iterative refinement to achieve accuracy, relevance, and safety.

- Leverage frameworks such as LangChain, LangGraph, Haystack, or custom orchestrators for multi-step reasoning workflows.

- Integrate APIs from OpenAI, Anthropic, Cohere, and Google, and/or self-hosted open-source models (Llama, Mistral, DeepSeek, etc.).

- Manage, query, and optimize vector databases such as Pinecone, Weaviate, Milvus, or built-in vector stores.

- Develop robust, production-ready backend services using Python (FastAPI, Flask, Django, or similar).

- Implement asynchronous processing using tools like Celery, Kafka, RabbitMQ, Redis Streams, or cloud-native event systems (see the asynchronous worker sketch after this list).

- Design APIs and data services for internal and external AI-powered features.

- Ensure reliability, scalability, logging, observability, and efficient resource usage in AI workloads.

- Design embedding pipelines, chunking strategies, and hybrid search approaches (semantic + keyword + metadata filtering).

- Build and maintain prompt templates, caching layers, and memory systems (episodic, vector, extended context).

- Apply best practices to ensure data security, governance, and filtering in compliance with policies or enterprise standards.

- Profile and optimize inference costs, latency, and throughput across agents and pipelines.

- Conduct A/B testing, evaluation with benchmark datasets, telemetry-based improvement, and error analysis.

- Troubleshoot model hallucination, prompt drift, and retrieval inconsistencies.

- Work closely with Product, Data Science, and Backend teams to translate requirements into AI capabilities.

- Conduct code reviews, provide architecture guidance, and mentor junior engineers.

- Participate in roadmap discussions and identify opportunities to introduce advanced AI-driven solutions.
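
To give a rough sense of the RAG pipeline work described above, the sketch below shows only the basic shape: chunk documents, embed them, retrieve the top-k chunks for a query, and assemble a prompt. It is illustrative only; the hash-based embed() is a stand-in for a real embedding model, and the in-memory store is a stand-in for a vector database such as Pinecone, Weaviate, or Milvus.

```python
# Minimal, illustrative RAG retrieval sketch. The embed() below is a toy
# hash-based stand-in for a real embedding model, and the in-memory "store"
# is a stand-in for a vector database (Pinecone, Weaviate, Milvus, pgvector).
import hashlib
import math


def embed(text: str, dim: int = 64) -> list[float]:
    """Toy embedding: hash each token into a fixed-size, normalized vector."""
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def chunk(text: str, size: int = 40) -> list[str]:
    """Naive fixed-size chunking by words; real pipelines typically add
    overlap, sentence boundaries, or structure-aware splitting."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already normalized, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))


# "Index" the corpus: embed every chunk and keep (chunk, vector) pairs in memory.
corpus = "Your internal documents would go here ..."
store = [(c, embed(c)) for c in chunk(corpus)]


def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the top-k chunks ranked by cosine similarity to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]


def build_prompt(query: str) -> str:
    """Assemble retrieved context plus the question into a single prompt,
    which would then be sent to an LLM API (OpenAI, Anthropic, etc.)."""
    context = "\n\n".join(retrieve(query))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("What do the documents say?"))
```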
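The asynchronous, event-driven style referenced above can be illustrated with nothing more than the standard library. The worker pattern below uses an in-process asyncio.Queue; in practice the queue would be Celery, Kafka, RabbitMQ, or Redis Streams, and handle_request() would call an actual model endpoint or RAG pipeline.

```python
# Simplified asynchronous worker pattern using only the standard library.
# In production the in-process asyncio.Queue would be replaced by Celery,
# Kafka, RabbitMQ, or Redis Streams, and handle_request() would call an LLM API.
import asyncio


async def handle_request(payload: str) -> str:
    """Placeholder for an inference or retrieval call; sleeps to simulate I/O."""
    await asyncio.sleep(0.1)
    return f"processed: {payload}"


async def worker(name: str, queue: asyncio.Queue) -> None:
    """Pull events off the queue until a None sentinel arrives."""
    while True:
        payload = await queue.get()
        if payload is None:          # shutdown signal
            queue.task_done()
            break
        result = await handle_request(payload)
        print(f"[{name}] {result}")
        queue.task_done()


async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    workers = [asyncio.create_task(worker(f"worker-{i}", queue)) for i in range(3)]

    for i in range(10):              # enqueue some events
        await queue.put(f"event-{i}")
    for _ in workers:                # one sentinel per worker
        await queue.put(None)

    await queue.join()
    await asyncio.gather(*workers)


if __name__ == "__main__":
    asyncio.run(main())
```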

Required Skills & Qualifications:

Technical Expertise:

- 5+ years of backend engineering experience, including microservices, distributed systems, and API development.

- 2+ years of hands-on experience with LLMs or AI agents, including production deployment.

- Strong Python engineering skills with experience in asyncio, event-driven design, and scalable architectures.

- Deep understanding of GenAI concepts:

a. Embeddings, vector search

b. RAG pipelines

c. Prompt engineering and templates

d. Memory architectures (vector memory, long-term memory, structured memory)

- Experience with orchestration frameworks: LangChain, LangGraph, Haystack, or custom graph-based workflows.

- Familiarity with LLM APIs: OpenAI, Anthropic, Cohere, Google Vertex, AWS Bedrock, etc.

- Hands-on experience with vector stores: Pinecone, Weaviate, Milvus, or pgvector.

- Strong understanding of relational and NoSQL data stores such as PostgreSQL and Redis.

