Job Description

Description : Speech Data Scientist


Experience Level : 3-6 years



Location : Bangalore, India

Key Responsibilities :

Core Development & Implementation :


- Design and implement end-to-end speech analytics pipelines for production environments


- Develop ASR engines using state-of-the-art frameworks (Wav2vec, Whisper, DeepSpeech) with PyTorch or TensorFlow


- Build and optimize speaker diarization, language identification (LID), and text post-processing systems


- Focus on multilingual audio processing capabilities


- Lead data selection strategies for domain adaptation and model optimization

Model Development & Enhancement :


- Develop and analyze objective measures for speech quality evaluation and enhancement


- Implement speaker-conditioned personalization techniques for improved ASR accuracy in noisy environments


- Optimize on-device ASR models with emphasis on multilingual scenarios


- Guide teams on best practices for model accuracy improvement and performance optimization

Research & Innovation :


- Conduct research on advanced speech processing techniques including neural speech enhancement


- Develop novel approaches for complex audio scenarios and multi-speaker environments


- Contribute to patent applications and research publications in speech technology


- Stay current with the latest developments in transformer models, attention mechanisms, and foundation models

Technical Integration & Deployment :


- Design integration architectures for speech-to-text services and supporting technologies


- Implement MLOps processes and CI/CD pipelines for speech models


- Deploy and scale speech solutions on cloud platforms (AWS, GCP)


- Develop production-ready applications using Python, C++, and Java

Educational Background :


- Ph.D./M.Tech./M.S. in a relevant field (Computer Science / Signal Processing) preferred


- B.Tech/B.E in ECE, CSE, or related technical field

Technical Expertise :

Core Speech Processing :


- 3-6 years of hands-on experience in speech recognition and processing


- Deep understanding of classical methodologies: HMMs, GMMs, ANNs, language modeling


- Expertise in modern deep learning techniques: CNNs, RNNs, LSTMs, CTC, attention mechanisms


- Strong background in digital signal processing and audio analysis

Machine Learning & Deep Learning :


- Proficiency with PyTorch and TensorFlow frameworks


- Experience with transformer models (BERT, Wav2vec 2.0, Whisper)


- Knowledge of end-to-end ASR implementation and optimization


- Understanding of foundation models and transfer learning approaches

Programming & Tools :


- Strong Python programming skills with ML/DL libraries (NumPy, pandas, scikit-learn)


- Experience with C++ and Java for production implementations


- Proficiency in bash scripting and automation


- Familiarity with version control (Git) and collaborative development

Cloud & Deployment :


- Hands-on experience with cloud platforms (AWS, GCP)


- Knowledge of containerization (Docker, Kubernetes)


- Experience with MLOps tools and CI/CD pipelines


- Understanding of model serving and scalability considerations

