Justdial - Senior Architect - Voice AI Solutions

Posted on: 03/02/2026

Job Description

Job Summary :

As a Senior Architect, you will lead the design and scaling of end-to-end Voice AI solutions tailored for the Indian market. You will architect the "brain" and "ears" of our agents, ensuring they move beyond simple chatbots to become human-like entities capable of managing complex, code-switched conversations (such as Hinglish or Tanglish) with ultra-low latency and cultural intelligence.

Key Responsibilities :

1. System Architecture & Multilingual Orchestration :

- Voice Stack Design : Build the cascading architecture linking STT (Speech-to-Text), LLM (Reasoning), and TTS (Text-to-Speech).

- Indic Language Strategy : Implement real-time Language Identification (LID) modules to detect transitions between any of the 22 official Indian languages within the first few syllables.

- Latency Optimization : Architect streaming audio pipelines to achieve a "Time to First Byte" (TTFB) under 200ms, ensuring the agent feels instantaneous even during complex language shifts.

- Code-Switching Logic : Design NLU (Natural Language Understanding) frameworks specifically capable of handling Hinglish, Tanglish, and Benglish seamlessly.

- Dialogue Management : Build sophisticated logic for turn-taking, handling user interruptions, and maintaining long-term conversational memory.

2. Technical Leadership & Linguistic Engineering :

- Dialectal Robustness : Develop ASR pipelines that are resilient to diverse Indian accents and noisy environments (e.g., bustling markets or public transport).

- Tool & API Integration : Architect how voice agents interact with external systems using function calling, ensuring the agent can "act" on what it hears.

- Indic-Specific TTS : Oversee the tuning of TTS models to ensure natural prosody, emotional inflection, and correct pronunciation of local names, addresses, and cultural idioms.

- Tokenization Efficiency : Optimize LLM tokenizers for Indic scripts (Devanagari, Tamil, Telugu, etc.) to reduce computational costs and improve processing speed.

3. Security, Ethics & Quality :

- PII & Compliance : Implement rigorous safeguards for PII redaction in voice transcripts, adhering to data privacy standards.

- Bias Mitigation : Establish frameworks to prevent "hallucinations" and ensure the agent remains neutral and respectful across different cultural and linguistic contexts.

Required Skills & Qualifications :

- AI Frameworks : Mastery of LangChain, LangGraph, or specialized voice SDKs (e.g., Pipecat, LiveKit, Vercel AI SDK).

- Indic NLP Stack : Hands-on experience with Bhashini, Sarvam AI, or BharatGen models and libraries like iNLTK.

- Real-time Protocols : Deep proficiency in WebSockets and WebRTC for lag-free audio streaming.

- Experience : 5-8 years in software architecture, with 2-3 years dedicated to NLP or Conversational AI.
