Posted on: 25/09/2025
We are building a next-generation Customer Data Platform (CDP) powered by the Databricks Lakehouse architecture and Lakehouse Engine framework. We're looking for a skilled Data Engineer with 4-9 years of experience to help us build metadata-driven pipelines, enable real-time data processing, and support marketing campaign orchestration capabilities at scale.
Responsibilities:
- Configure and extend the Lakehouse Engine framework for batch and streaming pipelines.
- Implement the medallion architecture (Bronze -> Silver -> Gold) using Delta Lake (a minimal Bronze-to-Silver sketch follows this list).
- Develop metadata-driven ingestion patterns from various customer data sources.
- Build reusable transformers for PII handling, data standardization, and data quality enforcement.
- Build Spark Structured Streaming pipelines for customer behavior and event tracking.
- Set up Debezium + Kafka for Change Data Capture (CDC) from CRM systems.
- Design and develop identity resolution logic across both streaming and batch datasets.
- Use Unity Catalog for managing RBAC, data lineage, and auditability.
- Integrate Great Expectations or similar tools for continuous data quality monitoring.
- Set up CI/CD pipelines for deploying Databricks notebooks, jobs, and DLT pipelines.
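As referenced above, here is a minimal PySpark sketch of a Bronze-to-Silver Structured Streaming hop with PII masking. The table names, the PII column ("email"), the de-duplication key, and the checkpoint path are illustrative assumptions, not the team's actual Lakehouse Engine configuration.

```python
# Minimal sketch of a Bronze -> Silver Structured Streaming hop with PII masking.
# Table names, the PII column ("email"), and the checkpoint path are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze_to_silver_customer_events").getOrCreate()

# Read the raw Bronze Delta table as a stream (assumed Unity Catalog table name).
bronze = spark.readStream.table("cdp.bronze.customer_events")

silver = (
    bronze
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .withColumn("email_hash", F.sha2(F.lower(F.col("email")), 256))  # mask the raw PII value
    .drop("email")
    .withWatermark("event_ts", "1 hour")
    .dropDuplicates(["event_id", "event_ts"])  # basic de-duplication with bounded state
)

# Append the cleaned records to the Silver Delta table.
(
    silver.writeStream
    .option("checkpointLocation", "/Volumes/cdp/checkpoints/silver_customer_events")  # assumed path
    .outputMode("append")
    .toTable("cdp.silver.customer_events")
)
```

In a real metadata-driven setup, the source and target tables, masking rules, and watermark would come from pipeline configuration rather than being hard-coded as above.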
Requirements:
- 4-9 years of hands-on experience in data engineering.
- Expertise in Databricks Lakehouse platform, Delta Lake, and Unity Catalog.
- Advanced PySpark skills, including Structured Streaming.
- Experience implementing Kafka + Debezium CDC pipelines (see the sketch after this list).
- Strong skills in SQL transformations, data modeling, and analytical querying.
- Familiarity with metadata-driven architecture and parameterized pipelines.
- Understanding of data governance: PII masking, access control, and lineage tracking.
- Proficiency in working with AWS, MongoDB, and PostgreSQL.
- Experience working on Customer 360 or Martech CDP platforms.
- Familiarity with Martech tools like Segment, Braze, or other CDPs.
- Exposure to ML pipelines for segmentation, scoring, or personalization.
- Knowledge of CI/CD for data workflows using GitHub Actions, Terraform, or Databricks CLI.
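For the Kafka + Debezium requirement above, a minimal PySpark sketch of landing raw CDC events into a Bronze Delta table. The broker address, topic name, checkpoint path, and target table are illustrative assumptions; the full Debezium envelope is kept as text in Bronze and parsed downstream in Silver.

```python
# Minimal sketch of landing Debezium CDC events from Kafka into a Bronze Delta table.
# Broker, topic, checkpoint path, and table name are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("crm_cdc_to_bronze").getOrCreate()

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")   # assumed broker
    .option("subscribe", "crm.public.customers")       # assumed Debezium topic
    .option("startingOffsets", "earliest")
    .load()
)

bronze = raw.select(
    F.col("key").cast("string").alias("record_key"),
    F.col("value").cast("string").alias("debezium_envelope"),  # raw change event as JSON text
    "topic",
    "partition",
    "offset",
    F.col("timestamp").alias("kafka_ts"),
)

(
    bronze.writeStream
    .option("checkpointLocation", "/Volumes/cdp/checkpoints/bronze_crm_customers_cdc")  # assumed path
    .outputMode("append")
    .toTable("cdp.bronze.crm_customers_cdc")
)
```

Keeping the unparsed envelope in Bronze preserves the full change history; the op code, before/after images, and transaction timestamps can then be extracted in the Silver layer.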
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1551986