Description :

We are seeking a highly motivated and innovative Applied Scientist to join our research team in Mumbai, India. In this role, you will spearhead the development of intelligent computer vision solutions leveraging real-time video data, vision-language models (VLMs), and advanced architectures. Your work will focus on solving complex real-world problems using multimodal learning, video understanding, self-supervised techniques, and LLM-enhanced vision models. You will work at the intersection of vision, language, and reasoning, driving research and innovation from ideation through deployment.

Responsibilities :

- Research and build state-of-the-art computer vision systems with a focus on real-time video analytics, video summarisation, object tracking, and activity recognition.

- Develop and apply Vision-Language Models (VLMs) and multimodal transformer architectures for deep semantic understanding of visual content.

- Apply self-supervised, zero-shot, and few-shot learning techniques to enhance model generalisation across varied video domains.

- Explore and optimise LLM prompting strategies and cross-modal alignment methods for improved reasoning over vision data.

- Contribute to research publications, patents, and internal IP assets in the area of vision and multimodal AI.

Requirements :

- Master's degree in Computer Science, Computer Vision, Machine Learning, or a related discipline, with 2+ years of experience leading applied research or product-focused CV/ML projects.

- Expertise in modern computer vision architectures (e.g., ViT, SAM, CLIP, BLIP, DETR, or similar).

- Experience with Vision-Language Models (VLMs) and multimodal AI systems.

- Strong background in real-time video analysis, including event detection, motion analysis, and temporal reasoning.

- Experience with transformer-based architectures, multimodal embeddings, and LLM-vision integrations.

- Proficiency in Python, deep learning libraries such as PyTorch or TensorFlow, and OpenCV.

- Experience with cloud platforms (AWS, Azure) and model deployment tooling (ONNX, TensorRT) is a plus.

- Strong problem-solving skills, with a track record of end-to-end ownership of applied ML/CV projects.

- Excellent communication and collaboration skills, with the ability to work in cross-functional teams.