Job Description

Location : Gurugram Headquarters

Type : Full-Time

Experience Level : Mid-Senior

About PlanetSpark :

PlanetSpark is building the world's largest platform for communication skills mastery, empowering children aged 5 to 14 to become confident speakers and storytellers. With over 80,000 students, 185,000+ live classes monthly, and a presence in over 22 countries, we're revolutionizing education with a blend of Live 1:1, Live Group, and AI-driven learning.

As we scale globally, we're integrating cutting-edge AI to provide personalized, data-driven, and engaging learning experiences - and we want you to help lead this transformation.

Role Overview :

We are looking for an innovative and hands-on AI Engineer who is passionate about leveraging multimodal AI to assess and improve children's communication skills. You will play a core role in designing and deploying an AI-based Communication Skills Scoring System - capable of real-time, human-like evaluation across speech, video, and behavioral inputs.

Key Responsibilities :

Multimodal AI Development :

- Design and develop systems that combine speech recognition, natural language processing, computer vision, and audio analysis.

- Extract and analyze verbal and non-verbal cues: tone, pitch, facial expressions, gestures, posture, and more.

Scoring Engine Design :

- Develop ML models aligned with human scoring rubrics.

- Train on annotated datasets (e.g., expert-evaluated student speeches).

- Ensure high accuracy and explainability in scores across key communication parameters.

Real-Time Feedback System :

- Implement fast, lightweight inference pipelines to deliver instant, structured, and personalized feedback.

- Recommend improvement areas tailored to each learner's current level.

- Work closely with product and curriculum teams to gamify feedback for child engagement.

Model Evaluation & Improvement :

- Establish benchmark metrics and continuous evaluation pipelines.

- Fine-tune models iteratively based on expert feedback and user performance data.

Required Skills & Qualifications :

- Strong hands-on experience in Speech AI, Computer Vision, and NLP.

- Proficiency with deep learning frameworks and related libraries (e.g., PyTorch, TensorFlow, Transformers, OpenCV).

- Experience with multimodal ML, including video/audio synchronization and fusion models.

- Understanding of human communication scoring systems or education-tech applications is a big plus.

- Ability to work with large datasets and design data annotation/evaluation pipelines.

- Prior experience building real-time or low-latency AI systems.

- Familiarity with edge-device inference or model compression techniques is a plus.

Bonus Points :

- Experience in EdTech or behavioral AI.

- Publications or research in speech analysis, affective computing, or human-AI interaction.

- Contributions to open-source AI projects in related domains.

What We Offer :

- The opportunity to shape the future of education using cutting-edge AI.

- Collaborative and high-growth startup culture with global exposure.

- Access to thousands of hours of expert speech/video data for experimentation.
