Job Title : Senior AI/ML Engineer (Document Intelligence & Generative AI)

Location : Chennai - Onsite - Hybrid

Notice Period : Candidates must be able to join within 30 days.

Job Summary :

We are seeking a Senior AI/ML Engineer with 5-8+ years of experience to lead the development of our next-generation document processing platform. In this role, you will move beyond simple predictive modeling to design end-to-end intelligent systems that "read" and understand complex documents (PDFs, Images, Excel, PPT).

You will architect and deploy high-performance pipelines combining OCR, Document Layout Analysis, Vision-Language Models (VLMs), and Large Language Models (LLMs). You will be responsible not just for model experimentation, but for the full lifecycle, from architecture and quantization to scaling and production deployment.

Key Responsibilities :

1. System Architecture & Orchestration :

- Design end-to-end ML system architectures that orchestrate multiple components : OCR engines, embedding models, Vector DBs, and LLM prompt pipelines.

- Build robust preprocessing pipelines for diverse unstructured data formats (Images, PDFs, PPTs, Excel).

- Design async processing queues and microservices to handle high-concurrency workloads.

2. Advanced Model Implementation :

- OCR & Layout Analysis : Implement and fine-tune pipelines using tools like Tesseract, PaddleOCR, or cloud APIs (AWS Textract, Google Vision). Deploy document layout models like LayoutLMv3, Donut, or LiLT.

- Vision-Language Models : Leverage state-of-the-art VLMs (e.g., CLIP, BLIP-2, LLaVA, Florence-2) for multimodal understanding.

- Generative AI & LLMs : Implement information extraction using OpenAI/Claude APIs or host open-source models (Llama 3, Qwen, Mistral) for private deployments.

3. Performance Optimization & Scaling :

- Optimize ML systems for production scaling, ensuring low latency and high throughput for batch processing.

- Apply model optimization techniques, including distillation and quantization (using formats and runtimes such as GGUF and ONNX), for efficient CPU/GPU inference.

- Implement deployment strategies that balance cost and performance across GPU and CPU resources.

4. MLOps & Production Engineering :

- Build CI/CD pipelines for ML (using GitHub Actions or Jenkins) to automate testing and deployment.

- Implement monitoring solutions to track concept drift, latency, and inference costs.

- Develop and expose models via robust REST APIs.

Required Technical Qualifications :

- Experience :

5-8+ years of hands-on experience in Machine Learning / Deep Learning.
At least 2-3 years in a Senior or Lead role actively building production systems.

- Core Programming :

Expert proficiency in Python and ML frameworks (PyTorch, TensorFlow).

- Document Intelligence Stack (Must Have) :

OCR : Tesseract, EasyOCR, PaddleOCR, AWS Textract, or Google Vision.
Layout Models : LayoutLMv3, Donut, Pix2Struct, LiLT, LayoutXLM.
VLMs : CLIP, BLIP-2, GroundingDINO, LLaVA, Florence-2.

- Generative AI Stack :

Experience with LLMs (OpenAI, Gemini, Llama, Mistral).
Strong knowledge of Prompt Engineering, RAG (Retrieval-Augmented Generation), and Vector Databases.

- Engineering & Deployment :

Proficiency in API development (FastAPI/Flask).
Experience with Docker, container orchestration, and async task queues or message brokers (Celery/Kafka).
Experience optimizing models (quantization, ONNX, GGUF).

Preferred Skills :

- Cloud Platforms : Experience with AWS (SageMaker, Bedrock), GCP (Vertex AI), or Azure.

- MLOps : Experience setting up model monitoring for drift and cost optimization.

- Problem Solving : Ability to estimate feasibility, complexity, and cost of ML solutions.

- Ownership : Proven track record of taking vague product problems and converting them into clear, actionable ML tasks and production-ready systems.

Education :

- Required : Bachelor's Degree in Computer Science, Data Science, or a related field.

- Preferred : Master's Degree or specialized Certification Programs in AI/Deep Learning.