You'll lead 6-10 ML/CV engineers to deliver high-accuracy, low-latency video analytics (detection, tracking, segmentation, recognition, re-identification), tackling multi-camera tracking, 24/7 streaming at scale, long-tail drift, and edge optimization on GPUs/Jetson, while partnering with Product/Platform to meet FPS, latency, accuracy, and cost SLAs.

The day-to-day centers on core CV model design, training/eval, deployment, and streaming/edge performance, complemented by GenAI that amplifies the stack: integrating VLM/LLM capabilities for natural-language video search, incident summarization, operator Q&A, and copilot workflows; standing up RAG over video/sensor/metadata embeddings with robust prompts, tooling, evals, and guardrails; and driving data ops with synthetic data, auto-labeling, and active-learning triage, all with privacy, safety, and cost controls.

What You'll Own:

End-to-end delivery of CV products: problem framing / data/labeling / model design / optimization / deployment / monitoring / iteration.

Technical roadmap & architecture for video analytics pipelines (ingest / decode / infer / track / post-process / store/serve).

Team leadership: mentoring, hiring input, OKRs, code/research standards, and performance coaching.

Key Responsibilities:

Technical Leadership & Execution:

- Translate business goals (e.g., reduce shrinkage, increase throughput, improve safety) into measurable CV objectives and SLAs/SLOs (e.g., mAP/IDF1, per-frame latency, dropped-frame rate, cost/stream).

- Lead design reviews; establish MLOps and coding standards; enforce experiment tracking, reproducibility, and dataset/version governance.

- Drive capacity planning, GPU/Jetson utilization, batching/windowing strategy, autoscaling, and cost governance.

Computer Vision R&D (Detection, Tracking, Segmentation, Recognition):

- Deliver production models for object detection/segmentation/classification/tracking (e.g., YOLOv8/v9, Mask R-CNN/Mask2Former, EfficientNet/ConvNeXt, ByteTrack/OC-SORT/DeepSORT).

- Build person/vehicle/product re-ID, face/attribute recognition (e.g., ArcFace/CosFace), OCR (e.g., PP-OCR), and keypoint/action recognition (e.g., MMPose, SlowFast/X3D).

- Tackle domain adaptation, class imbalance, and occlusions; design augmentations and semi-supervised/active learning loops to harvest hard negatives.

Video Analytics & Edge Inference:

- Architect real-time pipelines using NVIDIA DeepStream/GStreamer/OpenCV; optimize decode (NVDEC), pre/post-processing, and trackers for 30-60 FPS at 1080p.

- Optimize with TensorRT/ONNX Runtime/Torch-TensorRT, INT8 calibration, pruning/distillation; leverage Jetson Orin/Xavier/Nano and DLA where applicable.

- Design multi-camera fusion, homography/camera calibration, and cross-camera ID consistency for retail, traffic, manufacturing, and security use cases.

- Implement privacy-by-design features (e.g., face/license-plate blur, PII redaction).

Generative AI & LLMs:

- Architect robust RAG: retrieval pipelines with Pinecone/ChromaDB/Milvus; index sharding/compaction; freshness policies; hybrid search.

- Design agents with LangChain/LangGraph; implement tool-use, safety filters, and guardrails; add evaluation loops (e.g., RAGAS/DeepEval-style).

Platform, Serving & MLOps:

- Ship services via FastAPI/Flask; containerize with Docker; orchestrate on Kubernetes (KServe) or on AWS SageMaker or Google Vertex AI.

- Build high-throughput inference with Triton Inference Server (dynamic batching, concurrent models, model ensembles).

- Streaming & storage: RTSP/RTMP ingest, Kafka/Kinesis, object storage + time-series DB; index/frame-level metadata for search and analytics.

- CI/CD with MLflow/DVC (artifacts, model registry), unit/integration tests, and rollout strategies (canary, shadow).

Observability, Drift & Governance:

- Production monitoring with Prometheus/Grafana; per-stage latency, FPS, GPU memory/SM occupancy, dropped frames, and backpressure.

- Model observability: data/feature drift, concept drift on detections/tracks, re-ID distribution shifts, outlier/novelty detection, safety metrics.

- Human-in-the-loop review tools (CVAT/Label Studio) and auto-retraining triggers; maintain model cards, evaluation reports, versioned prompts/configs, and auditability.

- Ensure compliance and privacy/PII handling; ONVIF/edge security best practices.

Cross-Functional & People Leadership:

- Partner with Product/SRE/DevOps on roadmaps, SLAs, incident response runbooks, and cost/perf tradeoffs.

- Lead and grow a 6-10 person CV team; foster a culture of high-quality experiments, rigorous reviews, and measurable impact.

- Communicate progress/risks to executives with clear, metric-driven updates and customer-facing results.

Required Qualifications:

- B.E./B.Tech/MCA degree in any discipline, preferably Computer Science.

- 5-8 years in ML/Computer Vision, with 2+ years leading 6-10 engineers delivering production video analytics.

- Proven track record shipping detection/tracking/segmentation/recognition systems with business impact (accuracy, latency, cost).

Strong in at least one per category (and comfortable across most):

- CV Frameworks: PyTorch (preferred) or TensorFlow; OpenCV, NVIDIA DeepStream/GStreamer.

- Serving/Runtime: Triton Inference Server, FastAPI/Flask.

- Optimization: TensorRT, ONNX Runtime, Torch-TensorRT; quantization/pruning/distillation; INT8 calibration.

- Tracking/Re-ID/OCR: ByteTrack/OC-SORT/DeepSORT; ArcFace/CosFace; PP-OCR/Tesseract.

- Agents & Retrieval: LangChain or LangGraph; Pinecone/ChromaDB/Milvus.

- MLOps: Docker, Kubernetes (KServe/SageMaker/Vertex AI), MLflow/DVC.

- Cloud: AWS (SageMaker, EC2/EKS) or GCP (Vertex AI) or Azure ML (AKS).

- Programming: Python (expert); C++ or Go for perf-critical components; CUDA fundamentals a plus.

- Streaming/IO: RTSP/RTMP, Kafka/Kinesis/RabbitMQ; ONVIF familiarity.

- Strong system design (multi-stream pipelines, GPU scheduling, distributed tracking/indexing) and excellent communication.

Preferred Qualifications:

- Operating vector or time-series stores for video metadata at 10M-50M+ rows; search over tracks, IDs, and events.

- Experience with multi-camera tracking, calibration and zone-based analytics.

- Jetson fleet management.

- Ray/Ray Serve or Kubeflow; feature stores (Feast); complex event processing.

- Domain experience in one or more: retail analytics, traffic/ADAS, manufacturing QA, security/safety, sports analytics.

- Open-source contributions, patents, or publications in CV/video analytics.

- (Nice to have) Multimodal exposure (CLIP/SigLIP, SAM/Mask2Former, or VLMs for captioning/search), used sparingly to support CV workflows.

Posted By

Auriga IT HR

HR at Auriga IT Consulting Pvt Ltd.

Last Active: 5 Dec 2025

Posted in

AI/ML

Functional Area

ML / DL Engineering

Job Code

1567053
