Job Title : AI Engineer, Voice & Speech Technologies

Experience Range : 5-10 Years

Location : Bangalore

Overview:

We are seeking a highly skilled AI/ML Engineer specializing in Voice, Speech, and Conversational AI technologies. The ideal candidate will have proven hands-on experience in designing intelligent conversational systems, working with LLMs and Generative AI, and integrating speech-driven interactions into real-time telephony environments. This role involves developing cutting-edge voicebots, chatbots, speech pipelines, and telephony logic while ensuring seamless integration across multiple communication channels.

This is an opportunity to shape the future of automated voice interaction systems by leveraging state-of-the-art AI frameworks, speech models, and telephony infrastructure.

Key Responsibilities :

- Design, develop, and optimize AI-driven conversational experiences across chat and voice platforms.

- Build and enhance NLU/NLP models for intent recognition, entity extraction, and context-aware dialogue management.

- Create, evaluate, and refine speech-based interaction flows using STT (speech-to-text) and TTS (text-to-speech) technologies.

- Apply Generative AI and LLM capabilities to enhance conversational quality, reasoning, personalization, and natural language generation.

- Develop and integrate AI systems with backend services and enterprise platforms using REST APIs and microservices architecture.

- Implement and maintain SIP-based call flows, VoIP routing, and telephony logic for real-time inbound and outbound interactions.

- Ensure reliable and low-latency communication between AI components, telephony infrastructure, and application layers.

- Monitor, evaluate, and fine-tune NLP/ASR/TTS model performance based on metrics such as accuracy, speed, stability, and user experience.

- Participate in solution design, architecture reviews, performance tuning, and deployment workflows across cloud environments.

Technical Skills & Requirements :

- Strong proficiency in Python for AI development, automation, and backend logic.

- Expertise with RESTful services, JSON structures, and API integration workflows.

- Proficiency with relational and non-relational databases (SQL/NoSQL).

- Solid understanding of LLMs, Generative AI, embeddings, RAG, vector storage, and AI retrieval architectures.

- Experience with NLP/NLU platforms, ASR systems, conversational AI frameworks, and dialog orchestration pipelines.

- Hands-on experience with TTS/STT tech stacks such as Whisper, Amazon Polly, Google Speech API, Azure Speech, Deepgram, or similar.

- Strong understanding of SIP, VoIP, telephony networks, TCP/IP, PBX frameworks, and real-time communication protocols.

- Practical experience with telephony and communication platforms such as Twilio, Asterisk, FreeSWITCH, WebRTC, or comparable services.

- Familiarity with modern cloud infrastructure (AWS, Azure, GCP) for deploying scalable AI-driven voice solutions.

- Experience with CI/CD pipelines and containerization (Docker, Kubernetes) is a bonus.

Soft Skills :

- Strong analytical, debugging, and technical problem-solving capabilities.

- Effective communicator able to translate technical workflows into stakeholder-friendly language.

- Ability to design human-like conversational experiences with empathy and usability in mind.

- Highly collaborative mindset with adaptability in fast-paced and evolving environments.

- Detail-oriented with a strong sense of ownership and commitment to high-quality delivery.

One-Line Tech Stack :

- Python, NLP/NLU