Posted on: 29/10/2025
Description :
Position Overview :
We're hiring an AI Lead Engineer to architect, ship, and scale production-grade Computer Vision with a focused GenAI charter.
You'll lead 6-10 ML/CV engineers to deliver high-accuracy, low-latency video analytics (detection, tracking, segmentation, recognition, re-identification), tackling multi-camera tracking, 24/7 streaming at scale, long-tail drift, and edge optimization on GPUs/Jetson, while partnering with Product/Platform to meet FPS, latency, accuracy, and cost SLAs.
The day-to-day centers on core CV model design, training/eval, deployment, and streaming/edge performance, complemented by GenAI that amplifies the stack : integrating VLM/LLM capabilities for natural-language video search, incident summarization, operator Q&A, and copilot workflows; standing up RAG over video/sensor/metadata embeddings with robust prompts, tooling, evals, and guardrails; and driving data ops with synthetic data, auto-labeling, and active-learning triage, all with privacy, safety, and cost controls.
What You'll Own :
End-to-end delivery of CV products : problem framing / data/labeling / model design / optimization / deployment / monitoring / iteration.
Technical roadmap & architecture for video analytics pipelines (ingest / decode / infer / track / post-process / store/serve).
Team leadership : mentoring, hiring input, OKRs, code/research standards, and performance coaching.
Key Responsibilities :
Technical Leadership & Execution :
- Define measurable CV objectives and SLAs/SLOs (e.g., mAP/IDF1, per-frame latency, dropped-frame rate, cost/stream).
- Lead design reviews; establish MLOps and coding standards; enforce experiment tracking, reproducibility, and dataset/version governance.
- Drive capacity planning, GPU/Jetson utilization, batching/windowing strategy, autoscaling, and cost governance.
Computer Vision R&D (Detection, Tracking, Segmentation, Recognition) :
- Deliver production models for object detection/segmentation/classification/tracking (e.g., YOLOv8/v9, Mask R-CNN/Mask2Former, EfficientNet/ConvNeXt, ByteTrack/OC-SORT/DeepSORT).
- Build person/vehicle/product re-ID, face/attribute recognition (e.g., ArcFace/CosFace), OCR (e.g., PP-OCR), and keypoint/action recognition (e.g., MMPose, SlowFast/X3D).
- Tackle domain adaptation, class imbalance, and occlusions; design augmentations and semi-supervised/active learning loops to harvest hard negatives.
Video Analytics & Edge Inference :
- Optimize with TensorRT/ONNX Runtime/Torch-TensorRT, INT8 calibration, pruning/distillation; leverage Jetson Orin/Xavier/Nano and DLA where applicable.
- Design multi-camera fusion, homography/camera calibration, and cross-camera ID consistency for retail, traffic, manufacturing, and security use cases.
- Implement privacy-by-design features (e.g., face/license blur, PII redaction).
Generative AI & LLMs :
- Design agents with LangChain/LangGraph; implement tool-use, safety filters, and guardrails; add evaluation loops (e.g., RAGAS/DeepEval-style).
Platform, Serving & MLOps :
- Ship services via FastAPI/Flask; containerize with Docker; orchestrate on Kubernetes (KServe) or AWS SageMaker/Vertex AI.
- Build high-throughput inference with Triton Inference Server (dynamic batching, concurrent models, model ensembles).
- Streaming & storage : RTSP/RTMP ingest, Kafka/Kinesis, object storage + time-series DB; index/frame-level metadata for search and analytics.
- CI/CD with MLflow/DVC (artifacts, model registry), unit/integration tests, and rollout strategies (canary, shadow).
Observability, Drift & Governance :
- Model observability : data/feature drift, concept drift on detections/tracks, re-ID distribution shifts, outlier/novelty detection, safety metrics.
- Human-in-the-loop review tools (CVAT/Label Studio) and auto-retraining triggers; maintain model cards, evaluation reports, versioned prompts/configs, and auditability.
- Ensure compliance and privacy/PII handling; ONVIF/edge security best practices.
Cross-Functional & People Leadership :
- Lead and grow a 6-10 person CV team; foster a culture of high-quality experiments, rigorous reviews, and measurable impact.
- Communicate progress/risks to executives with clear, metric-driven updates and customer-facing results.
Required Qualifications :
- 5-8 years in ML/Computer Vision with 2+ years leading 6-10 engineers delivering production video analytics.
- Proven track record shipping detection/tracking/segmentation/recognition systems with business impact (accuracy, latency, cost).
Strong in at least one per category (and comfortable across most) :
- CV Frameworks : PyTorch (preferred) or TensorFlow; OpenCV, NVIDIA DeepStream/GStreamer.
- Serving/Runtime : Triton Inference Server, FastAPI/Flask.
- Optimization : TensorRT, ONNX Runtime, Torch-TensorRT; quantization/pruning/distillation; INT8 calibration.
- Tracking/Re-ID/OCR : ByteTrack/OC-SORT/DeepSORT; ArcFace/CosFace; PP-OCR/Tesseract.
- Agents & Retrieval : LangChain or LangGraph; Pinecone/ChromaDB/Milvus
- MLOps : Docker, Kubernetes (KServe/SageMaker/Vertex AI), MLflow/DVC.
- Cloud : AWS (SageMaker, EC2/EKS) or GCP (Vertex AI) or Azure ML (AKS).
- Programming : Python (expert); C++ or Go for perf-critical components; CUDA fundamentals a plus.
- Streaming/IO : RTSP/RTMP, Kafka/Kinesis/RabbitMQ; ONVIF familiarity.
- Strong system design (multi-stream pipelines, GPU scheduling, distributed tracking/indexing) and excellent communication.
Preferred Qualifications :
- Experience with multi-camera tracking, calibration and zone-based analytics.
- Jetson fleet management.
- Ray/Ray Serve or Kubeflow; feature stores (Feast); complex event processing.
- Domain experience in one or more : retail analytics, traffic/ADAS, manufacturing QA, security/safety, sports analytics.
- Open-source contributions, patents, or publications in CV/video analytics.
- (Nice to have) Multimodal exposure (CLIP/SigLIP, SAM/Mask2Former, or VLMs for captioning/search), used sparingly to support CV workflows.