Posted on: 14/04/2026
Description:
1. Speech-to-Text (STT)
2. LLM orchestration
3. Text-to-Speech (TTS)
4. Streaming audio response
- Integrate with telephony and IVR providers for inbound and outbound voice automation
- Implement WebSocket-based real-time audio streaming pipelines for low-latency voice interaction
- Work with WebRTC or media streaming architectures for real-time communication
- Integrate with voice model providers for:
1. Speech recognition
2. Streaming transcription
3. Neural text-to-speech
4. Voice-to-voice models
- Build scalable backend services to handle high concurrent voice calls
- Optimize systems for:
1. Low latency
2. Streaming performance
3. Audio buffering
4. Real-time responsiveness
- Collaborate with product teams to build production-ready voice AI applications
- Design event-driven architectures for voice interaction pipelines
- Work closely with frontend, AI, and telephony teams to deliver end-to-end voice products
Must-Have Criteria:
Core Requirements:
- Hands-on experience with voice bots / conversational AI systems
- Experience building GenAI solutions in production (LLMs, RAG, orchestration)
- Solid understanding of real-time audio streaming (WebSockets, streaming pipelines)
- Knowledge of voice AI architecture (STT, LLM, TTS, turn management)
- Experience with telephony / IVR systems and call flows
- Strong fundamentals in:
1. Distributed systems & microservices
2. Event-driven and API-driven architectures
3. Async programming & concurrency (Python)
- Experience integrating voice/LLM model providers
- Ability to debug latency and real-time streaming issues
Good to Have:
- Familiarity with WebRTC / LiveKit / real-time streaming frameworks
- Exposure to GPU/accelerator-based inference
- Experience in call center / voice agent systems
- Knowledge of VAD, interruption handling, multilingual voice systems
- Experience handling high-concurrency voice applications
Note: Only immediate joiners, or candidates who can join within 15 days, are preferred.