Job Description

Role Summary :


You will own the full ML stack that turns raw dielines, PDFs, and e-commerce images into a self-learning system that reads, reasons about, and designs packaging artwork.

That includes :

- Building data-ingestion & annotation pipelines (SVG/PDF → JSON),

- Designing / modifying model heads on top of LayoutLM-v3, CLIP, GNNs, diffusion LoRAs,

- Training & fine-tuning on GPUs,

- Shipping inference APIs and evaluation dashboards.

You'll work day-to-day with packaging designers and a product manager; you are the technical authority on everything deep-learning for this domain.

Key Responsibilities :

Data & Pre-processing (~40 %) :

- Write robust Python scripts to parse PDF, AI, and SVG files; extract text, colour separations, images, and panel polygons.

- Implement Ghostscript, Tesseract, YOLO, and CLIP pipelines.

- Automate synthetic-copy generation for ECMA dielines.

- Maintain vocabulary YAMLs & JSON schemas.

Model R&D (~40 %) :

- Modify LayoutLM-v3 heads (panel-ID, bbox-reg, colour, contrastive).

- Build panel-encoder pre-training (masked-panel prediction).

- Add Graph-Transformer & CLIP-retrieval heads; optional diffusion generator.

- Run experiments, hyper-parameter sweeps, and ablations; track KPIs (IoU, panel-F1, colour recall).

MLOps & Deployment (~20 %) :

- Package training & inference into Docker/SageMaker or GCP Vertex jobs.

- Maintain CI/CD and experiment tracking (Weights & Biases, MLflow).

- Serve REST/GraphQL endpoints that designers and the web front-end call.

- Implement an active-learning loop that ingests designer corrections nightly.

Must-Have Qualifications :


- 5+ years of Python, 3+ years of deep learning (PyTorch, Hugging Face).

- Hands-on with Transformer-based vision-language models (e.g. LayoutLM, Pix2Struct) and at least one object-detection pipeline (YOLOv5/8, DETR).

- Comfortable hacking PDF/SVG tool-chains: PyMuPDF/pdfplumber, Ghostscript, svgpathtools, OpenCV.

- Experience designing custom heads / loss functions and fine-tuning large pre-trained checkpoints on limited data.

- Solid Linux & GPU know-how; can spin up, monitor, and profile multi-GPU jobs.

- Familiarity with graph neural networks or relational transformers.

- Clear, idiomatic Git & code-review discipline; writes reproducible experiments.

Nice-to-Have :


- Knowledge of colour science (Lab, ICC profiles, Pantone tables) or print production.

- Prior work on multimodal retrieval (CLIP, ImageBind) or diffusion fine-tuning (LoRA, ControlNet).

- Packaging / CPG industry exposure (Nutrition Facts, Drug Facts, ECMA codes).

- Experience standing up FAISS or similar vector search, and with AWS/GCP ML tooling.

- Familiarity with TypeScript/React front-ends for quick label-preview UIs.

Tool Stack You'll Own :


- DL frameworks: PyTorch, Hugging Face Transformers, torch-geometric

- Parsing / CV: PyMuPDF, pdfplumber, svgpathtools, OpenCV, Ghostscript

- OCR / Detectors: Tesseract, YOLOv8, Grounding DINO (optional)

- Retrieval: CLIP / ImageBind + FAISS

- MLOps: Docker, GitHub Actions, W&B or MLflow, AWS SageMaker / GCP Vertex

- Languages: 95 % Python, occasional Bash / JSON / YAML

Deliverables in the First 6 Months :


1. Data pipeline v1 that converts > 500 ECMA dielines + 200 PDFs into training-ready JSON.

2. Panel-encoder checkpoint with < 5 % masked-panel error.

3. MVP copy-placement model (LayoutLM-v3 backbone + heads) hitting ≥ 85 % IoU on validation.

4. REST inference service + designer preview UI able to draft lid/side-wrap artwork for one SKU in < 10 s.

5. Nightly active-learning retrain loop.

Reporting & Team :


- Reports to Head of AI (or CTO).

- Collaborates with 1 front-end engineer, 1 product manager, 2 packaging-design SMEs.
