Job Description

We are seeking an experienced AI Solution Architect to design, develop, and scale next-generation GenAI-powered microservices.

This role involves architecting multi-agent systems, building RAG pipelines, and deploying large-scale LLM applications using Google Cloud services.

You will play a key role in shaping the architecture of AI-driven solutions while ensuring security, performance, and scalability.


Key Responsibilities :


API & Microservices Development :


- Design and implement robust asynchronous APIs using FastAPI for GenAI microservices.

- Implement request routing, rate limiting, error tracking, and observability for production-grade systems.


Multi-Agent Orchestration :


- Architect multi-agent systems using LangGraph, CrewAI, or similar frameworks.

- Implement dynamic workflows with LangChain Expression Language (LCEL) and tool/function calling for complex task orchestration.


RAG & Knowledge Systems :


- Build retrieval-augmented generation (RAG) pipelines with advanced chunking, metadata tagging, and vector search integration.

- Work with vector databases such as FAISS, Pinecone, and GCP Matching Engine.


Caching & State Management :


- Develop session management layers and caching mechanisms using Redis (pub/sub, aioredis) to enable memory and persistence in real-time chat systems.


Cloud Deployment & LLM Optimization :


- Deploy and optimize LLM applications on Google Cloud Platform (Vertex AI, Cloud Run, Storage, IAM, Matching Engine).

- Integrate embedding models from OpenAI, Cohere, and Gemini.


Security & Compliance :


- Implement API key management, JWT-based authentication, and audit logging.

- Apply industry-standard security practices across all deployments.


Required Skills & Qualifications :


- 5+ years of backend engineering experience in Python.

- Strong expertise in FastAPI with async/await, background tasks, dependency injection, and exception handling.

- Hands-on experience with LangChain, LangGraph, LCEL, and multi-agent systems.

- Proficiency in Redis (pub/sub, async clients, caching layers) for conversation state and memory.

- Strong knowledge of Google Cloud Platform (Vertex AI, Cloud Run, IAM, Storage, Matching Engine).

- Familiarity with vector databases (FAISS, Pinecone, GCP Matching Engine) and embedding models (OpenAI, Cohere, Gemini).

- Experience with tool/function calling, session tracking, and context management in LLMs.

- Proficiency with Docker and building scalable microservice architectures.


Preferred Skills (Nice to Have) :


- Exposure to observability tools (Prometheus, Grafana, OpenTelemetry).

- Familiarity with CI/CD pipelines and automated deployments.

- Experience in fine-tuning or custom training of LLMs.

- Knowledge of MLOps practices for AI/ML model lifecycle management.

