Posted on: 14/04/2026
About the Role :
We are hiring a Senior AI Engineer, Voice AI to build state-of-the-art voice AI systems for real-world customer conversations. The ideal candidate has deep experience across speech and language layers, including ASR, diarization, multilingual voice pipelines, LLM prompting and optimization, summarization, extraction, and low-latency production deployment.
This is not a pure integration role. We are looking for someone who can operate close to the model and systems layer, make strong architectural trade-offs, and build reliable conversational AI under real-world constraints such as noisy telephony audio, multilingual users, interruptions, latency sensitivity, and business-critical workflows.
You will work on production voice agents and conversational intelligence systems that power customer interactions at scale, across live conversations, post-call analytics, and automation workflows.
Key Responsibilities :
- Design, build, and improve real-time Voice AI systems for customer conversations.
- Develop and optimize systems across the voice stack, including :
1. Automatic speech recognition (ASR)
2. Speaker diarization
3. Voice activity detection (VAD)
4. Language identification
5. LLM reasoning and orchestration
6. Response generation
7. Text-to-speech (TTS)
8. Interruption handling and turn-taking
- Build low-latency, production-grade inference pipelines for live voice interactions.
- Improve multilingual and code-switched conversational performance for real-world telephony environments.
- Design prompt workflows, structured outputs, tool usage, and fallback logic to improve accuracy, control, and resolution rates.
- Build and refine systems for :
1. Summarization
2. Information extraction
3. Classification
4. Sentiment and emotion analysis
5. Call dispositioning
6. Topic and intent detection
7. Quality monitoring
- Define and implement evaluation frameworks for conversational quality, latency, hallucination, containment, task completion, extraction accuracy, and business outcomes.
- Fine-tune and optimize LLMs and speech models for domain-specific performance where needed.
- Work closely with product, engineering, and operations teams to translate ambiguous business problems into scalable AI systems.
- Troubleshoot production failures across speech, orchestration, and model behavior, and drive measurable improvements.
Required Qualifications :
- 5+ years of experience in AI/ML/NLP, with strong hands-on experience in Voice AI, Speech AI, or Conversational AI.
- Proven experience building or improving production voice systems used in real customer interactions.
- Strong understanding of the speech pipeline, including one or more of :
1. ASR
2. Diarization
3. VAD
4. Language identification
5. TTS
6. Telephony audio handling
- Strong experience with LLMs in production, including prompting, orchestration, evaluation, summarization, extraction, or agent workflows.
- Strong programming skills in Python and experience with frameworks such as PyTorch, TensorFlow, or similar.
- Experience with model serving, inference optimization, and production performance tuning.
- Strong grasp of system design trade-offs across latency, quality, reliability, and cost.
- Ability to work hands-on across experimentation, shipping, debugging, and iterative improvement.
Preferred Qualifications :
- Experience working in contact center AI, conversational analytics, or enterprise customer interaction products.
- Experience with multilingual speech systems, especially for Indian or other non-English language environments.
- Familiarity with tools, frameworks, or models such as :
1. Whisper
2. Kaldi
3. Hugging Face
4. Streaming ASR pipelines
5. Neural TTS systems
- Experience with :
1. Prompt optimization
2. Fine-tuning
3. PEFT / LoRA
4. Distillation
5. Evaluation harnesses
6. Structured generation
- Experience designing systems for :
1. Call summarization
2. QA automation
3. Compliance monitoring
4. Agent assist
5. Sentiment/emotion detection
6. Conversation intelligence
- Familiarity with production infrastructure such as Docker, Kubernetes, Redis, message queues, observability tooling, and scalable API services.