Posted on: 04/04/2026
Description :
Role Summary :
We are hiring a Product Data Architect to own the data architecture and data foundations for a portfolio of strategic AI/ML products. This role focuses on designing and building product-grade data pipelines, curated data layers, and fit-for-purpose data stores that power analytics and AI/ML use cases.
You will define data models, data contracts, quality and governance controls, and access patterns to ensure product teams can reliably consume high-quality data at scale.
Key Responsibilities :
- Design the end-to-end data architecture for assigned products: ingestion, transformation, curated layers, and serving/consumption.
- Build and oversee data pipelines (batch/stream where needed), including orchestration, error handling, recovery, and performance optimization.
- Define product-level data models (conceptual/logical/physical), including dimensional models, canonical entities, and domain schemas.
- Establish data contracts with upstream/downstream systems and product services (schemas, SLAs, validation rules, versioning).
- Implement and enforce data quality and observability: checks, anomaly detection, freshness/completeness, reconciliation, and alerting.
- Define master/reference data needs and harmonization approaches for product-specific domains.
- Ensure secure and compliant data handling: access control, PII masking/redaction, encryption standards, retention, and auditability.
- Partner with Data Engineering, ML/AI teams, and Product/Tech leads to enable use cases such as forecasting, pricing optimization, RAG/knowledge bases, and experimentation.
- Evaluate and recommend data tooling choices at the product layer (e.g., transformations, orchestration, streaming, serving stores) aligned to scalability and cost.
Key Skills :
1) Product Data Architecture & Modelling
- Strong experience designing product-oriented data architectures: domain boundaries, source-to-consumption flows, and curated layers.
- Expertise in data modelling (dimensional, normalized, hybrid) and defining canonical datasets for product use cases.
- Ability to design data products: clearly defined datasets with ownership, contracts, documentation, and usage SLAs.
2) Data Pipeline & Lake/Lakehouse Design
- Hands-on architecture of data pipelines (batch and near-real-time): ingestion, transformation, orchestration, and serving.
- Strong understanding of data lake/lakehouse patterns: bronze/silver/gold, CDC-based ingestion, incremental processing, partitioning, and compaction strategies.
- Ability to define scalable approaches for data integration from enterprise systems (ERP/CRM/MarTech/R&D/LIMS/manufacturing systems, files, APIs, event streams).
3) Data Quality, Governance & Observability
- Proven capability to implement data quality frameworks: validation rules, anomaly detection, reconciliation, and completeness/accuracy checks.
- Strong understanding of metadata, lineage, and cataloging, and how to make data discoverable and trustworthy.
- Experience defining and enforcing data access controls: classification, role-based access, masking/tokenization, auditability.
4) Performance, Reliability & Cost-Aware Design
- Expertise in designing performant datasets and pipelines: partitioning, clustering, indexing, query optimization, and workload management.
- Ability to define operational standards for pipelines: retries, idempotency, backfills, monitoring, alerting, and incident response.
- Cost/performance tradeoff thinking for storage and compute (especially for large-scale transformation workloads).
5) Integration with AI/Analytics Consumption
- Strong understanding of downstream needs for BI/analytics, ML feature engineering, and AI applications (including GenAI/RAG where relevant).
- Ability to shape datasets for consumption: feature-ready tables, semantic layers, and curated marts for product teams.
6) Cross-Functional Delivery & Stakeholder Management
- Ability to work closely with product teams, data engineers, platform teams, and security/compliance to deliver on product timelines.
- Strong documentation and communication: data lineage, source mapping, data dictionaries, and pipeline runbooks.
Skills Required :
- Experience with modern orchestration and transformation tooling (e.g., Airflow/Prefect, dbt, or equivalents).
- Familiarity with one or more ecosystems commonly used in enterprise data platforms (e.g., Spark/Databricks, Snowflake/BigQuery, Delta/Iceberg/Hudi).
- Exposure to master data management, reference data management, and consent/PII governance programs.
- Domain exposure in CPG/FMCG, pricing/revenue management, marketing/media analytics, supply chain forecasting, or R&D systems.
Qualifications :
- Bachelor's/Master's in Computer Science, Engineering, or a related discipline (or equivalent practical experience).
- 6-12 years of experience in roles such as Data Architect, Analytics Architect, Data Engineering Lead, or Data Platform Architect, with demonstrable ownership of data models and pipeline architectures for business-critical products.