Posted on: 03/06/2025
Role Summary :
You will own the full ML stack that turns raw dielines, PDFs, and e-commerce images into a self-learning system that reads, reasons about, and designs packaging artwork.
That includes :
- Building data-ingestion & annotation pipelines (SVG/PDF to JSON),
- Designing / modifying model heads on top of LayoutLM-v3, CLIP, GNNs, diffusion LoRAs,
- Training & fine-tuning on GPUs,
- Shipping inference APIs and evaluation dashboards.
You'll work day-to-day with packaging designers and a product manager; you are the technical authority on everything deep-learning for this domain.
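To give a sense of the SVG-to-JSON ingestion work named above, here is a minimal sketch using only the Python standard library. It assumes panels are drawn as `<polygon>` elements; the function name `svg_polygons_to_json` and the `panel_id` / `polygon` output fields are hypothetical, chosen for illustration and not taken from the actual pipeline or schema.

```python
import json
import xml.etree.ElementTree as ET

# Standard SVG namespace, in the "{uri}" form ElementTree uses for tag names.
SVG_NS = "{http://www.w3.org/2000/svg}"


def svg_polygons_to_json(svg_text: str) -> str:
    """Collect every <polygon> in an SVG string as a panel record (hypothetical schema)."""
    root = ET.fromstring(svg_text)
    panels = []
    for poly in root.iter(f"{SVG_NS}polygon"):
        # The "points" attribute holds numbers separated by commas and/or whitespace.
        numbers = [float(v) for v in poly.get("points", "").replace(",", " ").split()]
        # Pair the flat number list up into [x, y] vertices.
        vertices = [[x, y] for x, y in zip(numbers[0::2], numbers[1::2])]
        panels.append({"panel_id": poly.get("id"), "polygon": vertices})
    return json.dumps({"panels": panels}, indent=2)


# Toy two-panel dieline, invented for this example.
example_svg = """<svg xmlns="http://www.w3.org/2000/svg">
  <polygon id="front" points="0,0 100,0 100,150 0,150"/>
  <polygon id="lid" points="0,150 100,150 100,200 0,200"/>
</svg>"""

print(svg_polygons_to_json(example_svg))
```

Real dielines typically also use `<path>` elements, transforms, and layer conventions, which is where tools such as svgpathtools come in; this sketch ignores all of that.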
Key Responsibilities :
Data & Pre-processing (~40 %) :
- Write robust Python scripts to parse PDF, AI, SVG; extract text, colour separations, images, panel polygons.
- Implement Ghostscript, Tesseract, YOLO, CLIP pipelines.
- Automate synthetic-copy generation for ECMA dielines.
- Maintain vocabulary YAMLs & JSON schemas.
Model R&D (~40 %) :
- Modify LayoutLM-v3 heads (panel-ID, bbox-reg, colour, contrastive).
- Build panel-encoder pre-training (masked-panel prediction).
- Add Graph-Transformer & CLIP-retrieval heads; optional diffusion generator.
- Run experiments, hyper-parameter sweeps, ablations; track KPIs (IoU, panel-F1, colour recall).
MLOps & Deployment (~20 %) :
- Package training & inference into Docker/SageMaker or GCP Vertex jobs.
- Maintain CI/CD, experiment tracking (Weights & Biases, MLflow).
- Serve REST/GraphQL endpoints that designers and the web front-end call.
- Implement an active-learning loop that ingests designer corrections nightly.
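As an illustration of the IoU KPI listed above, the following is a minimal sketch of intersection-over-union for axis-aligned boxes. The `(x_min, y_min, x_max, y_max)` box convention and the names `box_iou` and `mean_iou` are assumptions made for this example; the actual evaluation code may define boxes and averaging differently.

```python
from typing import Sequence, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max) in the same units.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds: Sequence[Box], targets: Sequence[Box]) -> float:
    """Average IoU over matched prediction/target pairs."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    return sum(box_iou(p, t) for p, t in zip(preds, targets)) / len(preds)


# Two 10x10 boxes overlapping by half: intersection 50, union 150.
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

Under this definition, a validation target such as 85 % IoU would mean `mean_iou` of 0.85 over the matched validation pairs.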
Must-Have Qualifications :
- 5+ years Python, 3+ years deep-learning (PyTorch, Hugging Face).
- Hands-on with Transformer-based vision-language models (e.g. LayoutLM, Pix2Struct) and at least one object-detection pipeline (YOLOv5/8, DETR).
- Comfortable hacking PDF/SVG tool-chains: PyMuPDF/pdfplumber, Ghostscript, svgpathtools, OpenCV.
- Experience designing custom heads / loss functions and fine-tuning large pre-trained checkpoints on limited data.
- Solid Linux & GPU know-how; can spin up, monitor, and profile multi-GPU jobs.
- Familiarity with graph neural networks or relational transformers.
- Clear, idiomatic Git & code-review discipline; writes reproducible experiments.
Nice-to-Have :
- Knowledge of colour science (Lab, ICC profiles, Pantone tables) or print production.
- Prior work on multimodal retrieval (CLIP, ImageBind) or diffusion fine-tuning (LoRA, ControlNet).
- Packaging / CPG industry exposure (Nutrition Facts, Drug Facts, ECMA codes).
- Experience standing up FAISS or similar vector search, and with AWS/GCP ML tooling.
- Familiarity with TypeScript/React front-ends for quick label-preview UIs.
Tool Stack You'll Own :
- DL frameworks: PyTorch, Hugging Face Transformers, torch-geometric
- Parsing / CV: PyMuPDF, pdfplumber, svgpathtools, OpenCV, Ghostscript
- OCR / Detectors: Tesseract, YOLOv8, Grounding DINO (optional)
- Retrieval: CLIP / ImageBind + FAISS
- MLOps: Docker, GitHub Actions, W&B or MLflow, AWS SageMaker / GCP Vertex
- Languages: 95 % Python, occasional Bash / JSON / YAML
Deliverables in the First 6 Months :
1. Data pipeline v1 that converts > 500 ECMA dielines + 200 PDFs into training-ready JSON.
2. Panel-encoder checkpoint with < 5 % masked-panel error.
3. MVP copy-placement model (LayoutLM-v3 backbone + heads) hitting ~85 % IoU on validation.
4. REST inference service + designer preview UI able to draft lid/side-wrap artwork for one SKU in < 10 s.
5. Nightly active-learning retrain loop.
Reporting & Team :
- Reports to Head of AI (or CTO).
- Collaborates with 1 front-end engineer, 1 product manager, 2 packaging-design SMEs.
Posted in: AI/ML
Functional Area: ML / DL Engineering
Job Code: 1489929