You will design and ship state-of-the-art ASR models and production pipelines, bridging classical signal-processing foundations with modern transformer-based speech models to drive measurable product impact.

Role & Responsibilities :

- Lead design, training and optimisation of ASR systems (end-to-end and hybrid) using transformer and sequence modeling (Wav2Vec 2.0, Whisper, CTC, attention-based encoders/decoders).

- Develop and evaluate speech pre-processing and DSP pipelines (feature extraction, augmentation, denoising, VAD) to improve robustness across noisy, multilingual inputs.

- Prototype and productionise model-serving solutions: containerised inference, latency optimisation, batching, and autoscaling for cloud and edge deployments.

- Collaborate with data engineers and linguists to curate datasets, define annotation guidelines, and run rigorous evaluation (WER, CER, streaming metrics) and error-analysis cycles.

- Implement reproducible training workflows, CI/CD for models, monitoring for drift and performance, and automation for retraining and A/B evaluation.

- Mentor peers, author engineering-excellence patterns (testing, observability), and present technical results to product and stakeholder teams.

Skills & Qualifications :

Must-Have :

- 5+ years in speech recognition or related audio ML roles with proven production impact.

- Strong DSP and audio analysis fundamentals (feature engineering, spectrograms, filtering, VAD).

- Hands-on experience with PyTorch and/or TensorFlow for building and training ASR models.

- Practical knowledge of transformer-based speech models (Wav2Vec 2.0, Whisper) and sequence losses (CTC), plus RNN/CNN architectures.

- Proficient in Python; experience with C++/Java for production deployments is highly desirable.

- Experience deploying models in cloud environments (AWS/GCP) and container orchestration (Docker/Kubernetes); familiar with MLOps tooling and CI/CD.

Preferred :

- Background in multilingual ASR, low-resource languages, or on-device/edge inference optimisation.

- Experience with large-scale data pipelines, annotation platforms, and semi-supervised / self-supervised learning workflows.

- Familiarity with production monitoring (Prometheus/Grafana), model explainability, and privacy-preserving ML techniques.

Benefits & Culture Highlights :

- High-autonomy engineering culture with strong emphasis on ownership, mentorship, and career growth.

- Opportunity to influence product direction and work on state-of-the-art speech models at scale.

- Competitive compensation, flexible hybrid work, and learning budget for conferences and training.

We are seeking a results-oriented Speech Scientist who thrives on technical ownership and delivering dependable voice AI in real-world settings.

Apply if you want to push ASR boundaries and build production-grade speech systems that scale.

Posted by

Recruiter at Albatronix

Posted in

AI/ML

Functional Area

Data Science

Job Code

1551475
