Description:

Job Summary:

We are seeking a highly skilled Senior GenAI Engineer (Multi-Agent Systems) to play a key role in advancing our AI-driven platform and client solutions.

The ideal candidate has 4+ years of hands-on Machine Learning (ML) experience beyond academia, thrives in fast-paced environments, and enjoys solving complex technical challenges.

A strong foundation in cloud-based ML solutions, AI model deployment, and optimization techniques is essential.

Those with a passion for staying at the forefront of AI advancements and delivering high-impact solutions will excel in this role.

Key Responsibilities:

- Architect and scale multi-agent orchestration frameworks (e.g., LangGraph, AutoGen, LlamaIndex, CrewAI)

- Implement agent-to-agent communication, memory and state management

- Build evaluation and feedback loops (e.g., tracking latency, success rate, hallucination rate, and cost)

- Integrate agents with external APIs, data systems and event streams

- Ensure scalability, observability and MLOps for agent workflows

- Contribute to building and enhancing our Mechanized AI platform and mAI Modernize product suite

- Serve as the ML subject-matter expert (SME) on client projects as needed

- Design ML systems

- Research and implement appropriate ML algorithms and tools

- Select appropriate datasets and data representation methods

- Run ML tests and experiments

- Extend existing ML libraries and frameworks

- Stay current with emerging technologies and ML best practices to continuously improve our methodologies and tools

Required Skills & Experience:

- 5+ years of professional software engineering experience (AI-focused preferred)

- 2+ years building production LLM or multi-agent systems

- Practical proficiency with graph databases (e.g., TigerGraph, Neo4j) and graph-based retrieval (e.g., GraphRAG), plus experience with vector databases such as OpenSearch, Cosmos DB, or Pinecone

- Deep experience with one or more of: LangGraph, AutoGen 0.4, LlamaIndex AgentWorkflow, AG2, Strands Agents, or similar frameworks

- Retrieval for agents: OpenSearch plus TigerGraph, with GraphRAG patterns for global reasoning

- Observability: Langfuse or Phoenix with OpenTelemetry; track quality, cost, and latency

- Serving: Deploy and scale agentic workloads via AWS Fargate, Step Functions, and Lambda (or equivalent services on other clouds) while meeting p95 latency targets

- Safety by design, aligned with the OWASP Top 10 for LLM Applications; implement policy-compliant tool scopes and output validation

- Interoperability: Model Context Protocol (MCP) for tool and data access; experience with LangGraph MCP adapters is a bonus

- Experience with cloud environments (e.g., AWS, Azure, GCP), including ECS, Step Functions, and Lambda or their equivalents on other clouds

- Experience developing with, deploying, and managing and monitoring proprietary (closed-source) LLMs

- Knowledge of containerization technologies (e.g., Docker, Kubernetes) and microservices architecture

- Expertise in object-oriented programming (OOP) principles and test-driven development (TDD) with unit tests

- Retrieval-augmented generation (RAG) optimization

- Advanced experience in NLP techniques and applications

- Strong proficiency in Python programming

- Familiarity with prompt engineering approaches and best practices

- Knowledge of data structures, data modeling, and software architecture

- Effective written and oral communication skills (English at C1/C2, advanced or proficient level, is required)