Posted on: 16/04/2026
Description :
Role Summary :
The Associate AI/ML Engineer supports the development of machine learning and multimodal capabilities across text, image, audio, and video. This role focuses on data engineering tasks, implementing baseline models, and assisting senior engineers in building production-ready AI features for media workflows such as content tagging, speech transcription, image understanding, and personalized recommendations.
Detailed Responsibilities :
Model Development & Research :
- Train small-scale multimodal models using datasets containing images, audio, and video frames.
- Conduct experiments with OCR, ASR, TTS, object detection, and emotion/sentiment detection.
Data Preparation & Feature Engineering :
- Extract features from audio and video using tools such as OpenCV, FFmpeg, and librosa.
- Support dataset creation for multimodal tasks (frame extraction, audio segmentation).
Pipeline Development :
- Assist in building training and evaluation pipelines (data loaders, augmentation pipelines).
- Support deployment of inference APIs using FastAPI/Flask.
Testing, Documentation, and Code Quality :
- Document workflows, model behaviors, and experiment results.
Cross-functional Collaboration :
- Work with senior ML engineers, data engineers, and media teams to support AI features.
- Assist in preparing demos, POCs, and reports.
Must Have Skills :
- Python, Pandas, NumPy, Scikit-learn.
- Basics of deep learning (CNNs/RNNs/Transformers).
- Familiarity with PyTorch/TensorFlow, OpenCV, librosa.
- Understanding of media formats: MP4, WAV, PNG/JPEG, and subtitle files.
- Exposure to cloud ML services (Azure Video Indexer, AWS Rekognition).
Preferred Skills :
- Experience with pre-trained multimodal models (CLIP, Whisper, ViT).
- Basic understanding of embeddings and vector databases.
- Familiarity with data labeling and annotation tools.
Qualifications :
- Bachelor's degree in CS, AI, Data Science, or an equivalent field.
- Internships or academic projects in machine learning or media AI.