Posted on: 11/02/2026
Description :
Role Overview :
As Data Engineering Lead, you will serve as the architect and technical leader for our entire data platform, spanning real-time ingestion, large-scale processing, analytics, and machine-learning enablement.
This role demands deep hands-on expertise in Spark and Databricks, strong data warehouse architecture skills, and the ability to lead and mentor engineers while shaping the long-term data strategy of an AI-driven SaaS platform.
You will work at the intersection of data engineering, analytics, machine learning, and data quality, ensuring the platform is reliable, scalable, and ML-ready from day one.
Key Responsibilities :
Platform & Architecture Leadership :
- Own end-to-end data platform architecture: ingestion, processing, warehouse, semantic layer, and ML consumption
- Design and govern Spark-based processing architectures (batch and streaming)
- Lead implementation of Databricks Lakehouse patterns (Delta Lake, medallion architecture, optimized compute)
- Define data warehouse architecture supporting analytics, BI, and AI workloads
- Establish standards for scalability, performance, and cost optimization
Spark & Databricks Engineering :
- Architect and optimize Apache Spark jobs (PySpark / Spark SQL)
- Lead design of efficient joins, partitioning, caching, and performance tuning
- Implement structured streaming pipelines where required
- Guide the team on Databricks best practices: cluster configuration, job orchestration, notebooks vs pipelines
- Ensure reliable handling of high-volume, multi-tenant data
Data Modeling & Warehouse Design :
- Design dimensional, analytical, and semantic data models
- Define fact, dimension, and feature-ready tables
- Ensure alignment between warehouse design and downstream ML use cases
- Partner with Data Architects on conceptual and logical models
- Ensure query performance and analytical usability at scale
Machine Learning Enablement :
- Design pipelines that support feature engineering, model training, and inference
- Enable consistent, versioned, and reproducible datasets for ML workflows
- Collaborate with Data Science to operationalize models in production
- Support offline and near-real-time ML data needs
Data Validation & Quality Leadership :
- Define data validation and quality frameworks embedded into pipelines
- Lead implementation of checks for accuracy, completeness, timeliness, and consistency
- Partner with Data Validation and QA teams on quality standards
- Drive root-cause analysis and prevention of data defects
- Ensure trustworthiness of analytics and ML outputs
Team Leadership & Collaboration :
- Lead, mentor, and grow a team of data engineers and interns
- Conduct design reviews, code reviews, and architecture walkthroughs
- Guide engineers on best practices across Spark, Databricks, and warehousing
- Collaborate closely with Product, Data Science, QA, and Frontend teams
- Act as escalation point for complex data and performance issues
Required Skills & Experience :
Core Technical Expertise :
- 8-10 years of experience in data engineering and platform development
- Strong hands-on experience with Apache Spark (architecture, tuning, internals)
- Deep experience with Databricks and Lakehouse architectures
- Advanced SQL and data modeling expertise
- Strong understanding of distributed data processing
Data Warehouse & Analytics :
- Proven experience designing enterprise-scale data warehouses
- Strong grasp of dimensional modeling and analytical schemas
- Experience supporting BI, reporting, and ad-hoc analytics
- Understanding of semantic layers and analytics consumption patterns
Machine Learning & Data Quality :
- Experience supporting ML pipelines and feature engineering
- Strong understanding of data requirements for training and inference
- Hands-on experience with data validation, quality checks, and observability
- Ability to design data platforms with ML-readiness as a first-class concern
Leadership & Communication :
- Proven ability to lead technical teams and architecture initiatives
- Strong mentoring and coaching skills
- Ability to translate business problems into scalable data solutions
- Comfortable influencing cross-functional stakeholders
Posted in
Data Engineering
Functional Area
Data Engineering
Job Code
1611814