Posted on: 24/09/2025
About The Opportunity :
Join a high-velocity engineering team building robust, low-latency speech and voice solutions for large-scale deployments.
You will design and ship state-of-the-art ASR models and production pipelines, bridging classical signal-processing foundations with modern transformer-based speech models to drive measurable product impact.
Role & Responsibilities :
- Lead design, training and optimisation of ASR systems (end-to-end and hybrid) using transformer and sequence modeling (Wav2Vec 2.0, Whisper, CTC, attention-based encoders/decoders).
- Develop and evaluate speech pre-processing and DSP pipelines (feature extraction, augmentation, denoising, VAD) to improve robustness across noisy, multilingual inputs.
- Prototype and productionise model-serving solutions: containerised inference, latency optimisation, batching, and autoscaling for cloud and edge deployments.
- Collaborate with data engineers and linguists to curate datasets, define annotation guidelines, and run rigorous evaluation (WER, CER, streaming metrics) and error-analysis cycles.
- Implement reproducible training workflows, CI/CD for models, monitoring for drift and performance, and automation for retraining and A/B evaluation.
- Mentor peers, author engineering-excellence patterns (testing, observability), and present technical results to product and stakeholder teams.
Skills & Qualifications :
Must-Have :
- 5+ years in speech recognition or related audio ML roles with proven production impact.
- Strong DSP and audio analysis fundamentals (feature engineering, spectrograms, filtering, VAD).
- Hands-on experience with PyTorch and/or TensorFlow for building and training ASR models.
- Practical knowledge of transformer-based speech models (Wav2Vec 2.0, Whisper) and sequence losses (CTC), plus RNN/CNN architectures.
- Proficient in Python; experience with C++/Java for production deployments is highly desirable.
- Experience deploying models in cloud environments (AWS/GCP) and with container orchestration (Docker/Kubernetes); familiarity with MLOps tooling and CI/CD.
Preferred :
- Background in multilingual ASR, low-resource languages, or on-device/edge inference optimisation.
- Experience with large-scale data pipelines, annotation platforms, and semi-supervised / self-supervised learning workflows.
- Familiarity with production monitoring (Prometheus/Grafana), model explainability, and privacy-preserving ML techniques.
Benefits & Culture Highlights :
- High-autonomy engineering culture with strong emphasis on ownership, mentorship, and career growth.
- Opportunity to influence product direction and work on state-of-the-art speech models at scale.
- Competitive compensation, flexible hybrid work, and learning budget for conferences and training.
We are seeking a results-oriented Speech Scientist who thrives on technical ownership and delivering dependable voice AI in real-world settings.
Apply if you want to push ASR boundaries and build production-grade speech systems that scale.