Posted on: 02/12/2025
Job Title : Senior AI/ML Engineer (Document Intelligence & Generative AI)
Location : Chennai - Onsite - Hybrid
Notice Period : Must be able to join within 30 days
Job Summary :
We are seeking a Senior AI/ML Engineer with 5-8+ years of experience to lead the development of our next-generation document processing platform. In this role, you will move beyond simple predictive modeling to design end-to-end intelligent systems that "read" and understand complex documents (PDFs, Images, Excel, PPT).
You will architect and deploy high-performance pipelines combining OCR, Document Layout Analysis, Vision-Language Models (VLMs), and Large Language Models (LLMs). You will be responsible not just for model experimentation, but for the full lifecycle, from architecture and quantization to scaling and production deployment.
Key Responsibilities :
1. System Architecture & Orchestration :
- Design end-to-end ML system architectures that orchestrate multiple components : OCR engines, embedding models, Vector DBs, and LLM prompt pipelines.
- Build robust preprocessing pipelines for diverse unstructured data formats (Images, PDFs, PPTs, Excel).
- Design async processing queues and microservices to handle high-concurrency workloads.
2. Advanced Model Implementation :
- OCR & Layout Analysis : Implement and fine-tune pipelines using tools like Tesseract, PaddleOCR, or cloud APIs (AWS Textract, Google Vision). Deploy document layout models like LayoutLMv3, Donut, or LiLT.
- Vision-Language Models : Leverage state-of-the-art VLMs (e.g., CLIP, BLIP-2, LLaVA, Florence-2) for multimodal understanding.
- Generative AI & LLMs : Implement information extraction using OpenAI/Claude APIs or host open-source models (Llama 3, Qwen, Mistral) for private deployments.
3. Performance Optimization & Scaling :
- Optimize ML systems for production scaling, ensuring low latency and high throughput for batch processing.
- Apply model optimization techniques, including distillation and quantization (e.g., GGUF or ONNX formats), for efficient CPU/GPU inference.
- Implement deployment strategies that balance cost and performance across GPU and CPU resources.
4. MLOps & Production Engineering :
- Build CI/CD pipelines for ML (using GitHub Actions, Jenkins) to automate testing and deployment.
- Implement monitoring solutions to track concept drift, latency, and inference costs.
- Develop and expose models via robust REST APIs.
Required Technical Qualifications :
- Experience :