Posted on: 04/09/2025
We are seeking an experienced AI Solution Architect to design, develop, and scale next-generation GenAI-powered microservices.
This role involves architecting multi-agent systems, building RAG pipelines, and deploying large-scale LLM applications using Google Cloud services.
You will play a key role in shaping the architecture of AI-driven solutions while ensuring security, performance, and scalability.
Key Responsibilities:
API & Microservices Development:
- Design and implement robust asynchronous APIs using FastAPI for GenAI microservices.
- Ensure request routing, rate limiting, error tracking, and observability for production-grade systems (see the sketch below).
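For illustration, a minimal sketch of such an endpoint is shown below; call_llm is a hypothetical stand-in for the model provider client, and rate limiting, tracing, and request routing would typically live in middleware or an API gateway.

```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str

class GenerateResponse(BaseModel):
    completion: str

async def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for the actual model provider client
    return f"echo: {prompt}"

@app.post("/v1/generate", response_model=GenerateResponse)
async def generate(req: GenerateRequest) -> GenerateResponse:
    try:
        completion = await call_llm(req.prompt)
    except TimeoutError:
        # Surface upstream timeouts as a 504 so they appear in error tracking
        raise HTTPException(status_code=504, detail="LLM backend timed out")
    return GenerateResponse(completion=completion)
```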
Multi-Agent Orchestration:
- Architect multi-agent systems using LangGraph, CrewAI, or similar frameworks.
- Implement dynamic workflows with LangChain Expression Language (LCEL) and tool/function calling for complex task orchestration (see the sketch below).
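A rough sketch of LCEL-style composition follows; the import paths assume recent langchain-core and langchain-openai releases, and the model name is illustrative only.

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Prompt | model | parser composed with the LCEL pipe operator
prompt = ChatPromptTemplate.from_template(
    "You are a routing agent. Pick the best tool for this request: {request}"
)
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
chain = prompt | llm | StrOutputParser()

if __name__ == "__main__":
    # Requires OPENAI_API_KEY in the environment
    print(chain.invoke({"request": "Summarize yesterday's support tickets"}))
```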
RAG & Knowledge Systems:
- Build retrieval-augmented generation (RAG) pipelines with advanced chunking, metadata tagging, and vector search integration.
- Work with vector databases such as FAISS, Pinecone, and GCP Matching Engine (see the sketch below).
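By way of example, a bare-bones FAISS index over chunk embeddings could look like this sketch; the dimensionality and random vectors are placeholders for real embedding-model output.

```python
import faiss
import numpy as np

dim = 768                                                    # model-dependent embedding size
chunk_vectors = np.random.rand(1000, dim).astype("float32")  # placeholder chunk embeddings

index = faiss.IndexFlatIP(dim)     # exact inner-product search; use IVF/HNSW variants at scale
faiss.normalize_L2(chunk_vectors)  # normalized vectors make inner product behave like cosine
index.add(chunk_vectors)

query = np.random.rand(1, dim).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)  # top-5 nearest chunks
print(ids[0], scores[0])
```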
Caching & State Management:
- Develop session management layers and caching mechanisms using Redis (pub/sub, aioredis) to enable memory and persistence in real-time chat systems (see the sketch below).
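Below is a minimal sketch of Redis-backed session memory using redis-py's asyncio client (the successor to the standalone aioredis package); the key names and TTL are illustrative.

```python
import asyncio
import json

import redis.asyncio as redis

async def main() -> None:
    r = redis.from_url("redis://localhost:6379/0", decode_responses=True)

    # Append a turn to the session's conversation memory, expiring after 1 hour
    session_key = "chat:session:1234"
    await r.rpush(session_key, json.dumps({"role": "user", "content": "Hello"}))
    await r.expire(session_key, 3600)

    # Publish an event so websocket gateways / other workers can react
    await r.publish("chat:events", session_key)

    history = [json.loads(m) for m in await r.lrange(session_key, 0, -1)]
    print(history)

    await r.aclose()  # close() in older redis-py releases

asyncio.run(main())
```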
Cloud Deployment & LLM Optimization:
- Deploy and optimize LLM applications on Google Cloud Platform (Vertex AI, Cloud Run, Storage, IAM, Matching Engine).
- Integrate embedding models from OpenAI, Cohere, and Gemini (see the sketch below).
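As one example of the embedding side, the sketch below calls the OpenAI embeddings API via the official Python SDK; the model name is an assumption, and the returned vectors would feed the vector index used by the RAG pipeline.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.embeddings.create(
    model="text-embedding-3-small",        # assumed model name; swap as needed
    input=["First document chunk", "Second document chunk"],
)
vectors = [item.embedding for item in resp.data]
print(len(vectors), len(vectors[0]))       # number of chunks, embedding dimension
```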
Security & Compliance:
- Implement API key management, JWT-based authentication, and audit logging (see the sketch below).
- Maintain industry-standard security best practices across all deployments.
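A sketch of the JWT piece appears below: a FastAPI dependency that verifies a bearer token with PyJWT. The secret and claim names are placeholders; a real deployment would load the key from a secret manager and add audit logging.

```python
import jwt  # PyJWT
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
bearer = HTTPBearer()
SECRET_KEY = "replace-with-a-managed-secret"  # placeholder; load from a secret manager

def current_user(creds: HTTPAuthorizationCredentials = Depends(bearer)) -> dict:
    try:
        # Verifies signature and expiry; raises on tampered or expired tokens
        return jwt.decode(creds.credentials, SECRET_KEY, algorithms=["HS256"])
    except jwt.PyJWTError:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or expired token",
        )

@app.get("/v1/me")
async def me(user: dict = Depends(current_user)) -> dict:
    return {"subject": user.get("sub")}
```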
Required Skills & Qualifications:
- 5+ years of backend engineering experience in Python.
- Strong expertise in FastAPI with async/await, background tasks, dependency injection, and exception handling.
- Hands-on experience with LangChain, LangGraph, LCEL, and multi-agent systems.
- Proficiency in Redis (pub/sub, async clients, caching layers) for conversation state and memory.
- Strong knowledge of Google Cloud Platform (Vertex AI, Cloud Run, IAM, Storage, Matching Engine).
- Familiarity with vector databases (FAISS, Pinecone, GCP Matching Engine) and embedding models (OpenAI, Cohere, Gemini).
- Experience with tool/function calling, session tracking, and context management in LLMs.
- Proficiency with Docker and experience building scalable microservice architectures.
Preferred Skills (Nice to Have):
- Exposure to observability tools (Prometheus, Grafana, OpenTelemetry).
- Familiarity with CI/CD pipelines and automated deployments.
- Experience in fine-tuning or custom training of LLMs.
- Knowledge of MLOps practices for AI/ML model lifecycle management.
Posted by
Mrinmoyee Roy Chowdhury
Talent Acquisition Lead at CAPITALNUMBERS INFOTECH LIMITED
Posted in: AI/ML
Functional Area: Technical / Solution Architect
Job Code: 1540924