hirist

STG Labs - Senior Machine Learning Engineer - Python/NLP

Posted on: 20/11/2025

Job Description

Description :


- Own ongoing model performance and enhancement for one or more Dodge entities/domains.


- Analyze Dodge datasets in depth to recommend the best solutions for data management and enrichment using AI/ML.


- Design, develop, and test machine learning models to automate data enrichment, classification, and validation processes.


- Develop Python-based automation scripts and microservices to reduce manual effort in project matching, contact discovery, and quality checks.


- Implement NLP models for entity recognition (e.g., identifying architects, GCs, and project roles from unstructured text and PDF documents).


- Implement OCR, NLP, and layout recognition techniques to extract project metadata, deadlines, contacts, and technical requirements.


- Build Python-based scripts and microservices to classify documents by type and extract structured fields (e.g., bid dates, scope of work).


- Build pipelines that integrate scraped project data with external APIs (ZoomInfo, LinkedIn, etc.) to enrich company and contact information.


- Collaborate with data engineers to ensure ML pipelines integrate seamlessly with existing data warehouses.


- Partner with data specialists to design feedback loops that validate and improve model outputs.

Required Qualifications :


- 5+ years of experience in Machine Learning and automation engineering.


- Proficiency in Python with hands-on experience using libraries such as scikit-learn, spaCy, TensorFlow, or PyTorch.


- Hands-on experience with OCR frameworks (Tesseract, PaddleOCR, AWS Textract, Google Document AI).


- Familiarity with document layout analysis (LayoutLM, Donut, DocTR, etc.).


- Strong knowledge of regex, rules-based parsing, and entity extraction techniques.


- Strong knowledge of data pipelines and ETL frameworks.


- Experience deploying ML models into production, monitoring performance, and maintaining pipelines.


- Solid understanding of relational databases and SQL; experience with large-scale warehouses (e.g., Redshift, Snowflake).


- Demonstrated experience automating repetitive tasks with Python, APIs, and workflow orchestration.


- Strong problem-solving skills with the ability to translate business use cases (project/contact enrichment, validation) into ML/automation solutions.

Preferred Qualifications :


- Experience with Named Entity Recognition (NER) and text classification models for parsing unstructured construction/project documents.


- Familiarity with AWS analytics/ML services (SageMaker, Comprehend, Lambda, Step Functions).


- Exposure to CI/CD pipelines and MLOps tools (MLflow, Git, Docker, Kubernetes).


- Prior experience working with sales intelligence data (contacts, companies, lead enrichment).


- Experience in Agile delivery environments using Jira or Confluence.

Mode of Work : Hybrid - This role adheres to the organization's hybrid work policy, requiring presence at the office on three designated days per week.

