Posted on: 10/01/2026
Role : Staff Machine Learning Engineer
Job Description :
In this role, you will collaborate closely with product managers, applied scientists, and engineers to design, implement, and scale ML-powered systems that extract and reason over information from complex, unstructured data sources such as documents, web screens, and automation workflows.
Your work will play a key role in enabling intelligent, context-aware assistants and task automation agents that understand user intent and deliver meaningful outcomes.
Key Responsibilities :
- Design and implement scalable Agentic RAG pipelines that orchestrate unstructured data preprocessing, vector store integration, and LLM prompting for accurate, grounded responses (a minimal sketch of this retrieve-then-prompt loop follows this list).
- Develop modular ML components for layout analysis, information extraction, dialogue management, and tool invocation within multi-turn, goal-driven conversations.
- Integrate conversational agents with enterprise data sources, APIs, and downstream automation workflows to enable end-to-end task execution.
- Craft dynamic prompting and memory strategies to maintain context, reduce hallucination, and improve relevance in long-form or multi-turn queries.
- Collaborate cross-functionally with product managers, UX designers, and platform engineers to define agent behavior and ensure seamless user experiences.
- Monitor and continuously improve system performance, including response quality, retrieval precision, latency, and task success rate, using real-world feedback.
- Drive data curation efforts including synthetic generation, annotation workflows, and hard example mining to improve agent robustness and generalization.
- Establish best practices for evaluating agent behavior, including human-in-the-loop review processes, regression testing, and safety guardrails.
- Contribute to infrastructure and MLOps pipelines that support model experimentation, deployment, and monitoring in production environments.
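For context on what the Agentic RAG responsibility involves in practice, here is a minimal sketch of the retrieve-then-prompt loop. It is illustrative only: the toy keyword-overlap retriever and the stubbed call_llm function are hypothetical placeholders, and a production pipeline would substitute a real vector store (e.g., FAISS, Pinecone, Weaviate) and an actual LLM client such as those wrapped by LangChain or LlamaIndex.

```python
# Minimal sketch of a retrieve-then-prompt (RAG) loop with a toy retriever and a stubbed LLM.
# All names here (retrieve, build_prompt, call_llm) are illustrative placeholders.
from typing import List

DOCS = [
    "Invoice 1042 was issued on 2024-03-01 and is due within 30 days.",
    "The onboarding workflow requires a signed W-9 before payment.",
    "Layout analysis splits each page into text blocks, tables, and figures.",
]

def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Toy lexical retriever: rank documents by keyword overlap with the query.
    A real pipeline would embed the query and search a vector store instead."""
    q_terms = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_terms & set(d.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: List[str]) -> str:
    """Ground the model by packing the retrieved passages into the prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

def call_llm(prompt: str) -> str:
    """Stub standing in for an actual LLM API call."""
    return f"[LLM response to a {len(prompt)}-character grounded prompt]"

if __name__ == "__main__":
    question = "When is invoice 1042 due?"
    answer = call_llm(build_prompt(question, retrieve(question, DOCS)))
    print(answer)
```

The "agentic" part of the role builds on top of this grounding step, extending the loop with tool/function calls, multi-turn memory, and task planning.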
Qualifications / Expectations :
- 7+ years of hands-on experience in building and deploying machine learning models, with a strong focus on document intelligence, NLP, or Generative AI.
- Proven experience developing and productionizing ML systems for unstructured data processing, including document parsing, table extraction, and layout analysis.
- Proficiency with modern ML frameworks such as PyTorch or TensorFlow, and experience with OCR and document processing tools (e.g., Tesseract, AWS Textract, PDFMiner, LayoutParser).
- Strong experience with LLMs and Retrieval-Augmented Generation (RAG) architectures, including practical knowledge of LangChain, LlamaIndex, Haystack, or similar frameworks.
- Understanding of knowledge graphs and graph-based information retrieval.
- Experience designing and integrating conversational agents with context memory, function/tool calling, and dynamic prompting strategies.
- Familiarity with vector stores (e.g., FAISS, Pinecone, Weaviate) and search/retrieval mechanisms for grounding LLM outputs (see the short indexing sketch after this list).
- Experience building ML pipelines and implementing MLOps best practices to support training, validation, deployment, and monitoring of models at scale.
- Hands-on experience with cloud ML services (e.g., AWS SageMaker, Azure ML, Google Vertex AI) for training and inference.
- Solid programming skills in Python, with working knowledge of SQL.
- Experience with containerization (Docker), orchestration (Kubernetes), and model serving technologies (e.g., Triton Inference Server, ONNX Runtime, TorchServe) in production environments.
- Knowledge of model optimization techniques (e.g., quantization, pruning, distillation) to improve inference efficiency on cloud or edge devices.
- Strong problem-solving abilities, with a track record of delivering scalable, high-impact ML solutions for complex, real-world problems.
- Excellent communication skills and ability to work autonomously in a fast-paced, collaborative environment.
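As a concrete anchor for the vector-store requirement above, here is a minimal FAISS indexing-and-search sketch; the random vectors, the 384-dimension size, and the top-5 cutoff are arbitrary assumptions standing in for real document embeddings produced by an embedding model.

```python
# Minimal FAISS sketch: index document embeddings, then retrieve nearest neighbours.
# The embeddings are random placeholders; a real system would use an embedding model.
import faiss                 # pip install faiss-cpu
import numpy as np

dim = 384                                                  # assumed embedding dimensionality
doc_embeddings = np.random.rand(1000, dim).astype("float32")
query_embedding = np.random.rand(1, dim).astype("float32")

index = faiss.IndexFlatL2(dim)        # exact L2 search; IVF/HNSW indexes scale further
index.add(doc_embeddings)             # add all document vectors to the index

distances, ids = index.search(query_embedding, 5)          # top-5 nearest documents
print(ids[0], distances[0])
```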
Nice to Have :
- Experience with dialogue management systems or conversational frameworks (e.g., Rasa, Dialogflow, or custom pipelines) for building intelligent agents.
- Experience with distributed training techniques to optimize large-scale model training across multiple GPUs or cloud environments.
- Understanding of prompt engineering best practices and few-shot or chain-of-thought prompting for improving agent behavior in GenAI systems.
- Background in open-source contributions or research related to document AI, LLM applications, or multi-modal learning.
- Familiarity with CI/CD pipelines for ML, automated model versioning, and monitoring tools for performance and drift in production models.