Posted on: 20/11/2025
Description:
About the Role
We are looking for a highly skilled Lead Data Engineer who can take full ownership of designing, building, and scaling modern data pipelines across batch and real-time systems.
You will play a central role in architecting our data platform, ensuring high reliability, quality, and performance across a diverse set of data sources and downstream consumers.
This is a hands-on leadership role suited for someone who thrives in fast-paced environments and enjoys solving complex data engineering challenges end-to-end.
Key Responsibilities:
1. Data Pipeline Architecture & Development
- Design, build, and maintain end-to-end batch and real-time data pipelines using modern data engineering tools and best practices.
- Implement and optimize ingestion frameworks supporting CDC pipelines, streaming data (Kafka/Redpanda), and scheduled batch jobs.
- Build robust, scalable ETL/ELT processes that integrate data from Postgres, MySQL, MongoDB, ClickHouse, Timescale, and external APIs.
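As an illustrative sketch of the CDC side of this work (not tied to any specific tool; the event shape and names are hypothetical), applying an ordered stream of change events to a keyed target with upsert semantics can look like:

```python
# Hypothetical CDC apply step: event shape ({"id", "op", "row"}) is
# illustrative, not the format of any particular CDC framework.

def apply_cdc_events(target: dict, events: list[dict]) -> dict:
    """Apply ordered insert/update/delete events to an in-memory keyed table."""
    for event in events:
        key = event["id"]
        op = event["op"]
        if op in ("insert", "update"):
            target[key] = event["row"]   # upsert: last write for a key wins
        elif op == "delete":
            target.pop(key, None)        # idempotent delete (missing key is fine)
        else:
            raise ValueError(f"unknown op: {op}")
    return target
```

In a production pipeline the same apply logic would run against a warehouse merge target rather than a dict, with ordering guaranteed by the source log offset.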
2. Data Platform & Storage
- Architect reliable data lake and warehouse solutions using S3/GCS, Parquet, partitioning, and metadata-driven systems.
- Implement schema evolution, deduplication, incremental ingestion, and automated backfills.
- Design cost-effective data storage and query strategies to support large-scale event volumes (millions of events per day).
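Two of the building blocks above, Hive-style date partitioning and latest-version deduplication for incremental merges, can be sketched in plain Python (bucket and column names here are hypothetical examples):

```python
from datetime import date

def partition_path(bucket: str, table: str, event_date: date) -> str:
    """Hive-style partition path, e.g. s3://<bucket>/<table>/dt=YYYY-MM-DD/."""
    return f"s3://{bucket}/{table}/dt={event_date.isoformat()}/"

def dedupe_latest(rows: list[dict], key: str, version: str) -> list[dict]:
    """Keep only the latest version of each record, as in an incremental merge."""
    latest: dict = {}
    for row in rows:
        k = row[key]
        # Higher version (e.g. updated_at) replaces the stored record.
        if k not in latest or row[version] > latest[k][version]:
            latest[k] = row
    return list(latest.values())
```

Partitioning by event date keeps scans bounded for time-range queries, and deduplicating on a key plus version column is what makes backfills and re-ingestion safe to repeat.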
3. Orchestration, Monitoring & Quality
- Develop workflow orchestration using Airflow, Dagster, or Prefect.
- Build monitoring and alerting for pipeline health using logs, metrics, and dashboards.
- Implement data quality checks, validation frameworks, SLAs, and reconciliation processes to ensure trust in the data.
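A minimal sketch of the kind of validation check described above (thresholds and column names are assumptions, not a specific framework's API):

```python
def check_quality(rows: list[dict], required: list[str],
                  max_null_rate: float = 0.01, min_rows: int = 1) -> list[str]:
    """Return a list of human-readable failures; empty means the batch passes."""
    failures = []
    if len(rows) < min_rows:
        failures.append(f"row count {len(rows)} below minimum {min_rows}")
    for col in required:
        nulls = sum(1 for r in rows if r.get(col) is None)
        rate = nulls / len(rows) if rows else 1.0
        if rate > max_null_rate:
            failures.append(f"{col}: null rate {rate:.2%} exceeds {max_null_rate:.2%}")
    return failures
```

In practice checks like these run as a task after each load, with failures routed to alerting rather than silently passing bad data downstream.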
4. Performance & Optimization
- Write high-performance SQL (window functions, complex aggregations, indexing, query tuning).
- Optimize pipelines for cost, speed, scalability, and reliability across distributed systems.
- Continuously improve data models and internal tooling to support analytics and downstream applications.
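As a small example of the window-function SQL mentioned above, here is a per-customer running total, run through Python's bundled sqlite3 (assumes SQLite 3.25+, which ships with modern Python; table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount REAL);
    INSERT INTO orders VALUES ('a', 10), ('a', 30), ('b', 20);
""")

# Running total per customer: SUM over a partition, ordered within it.
rows = conn.execute("""
    SELECT customer,
           amount,
           SUM(amount) OVER (PARTITION BY customer ORDER BY amount
                             ROWS UNBOUNDED PRECEDING) AS running_total
    FROM orders
    ORDER BY customer, amount
""").fetchall()
```

The same pattern (explicit frame, partition-scoped aggregate) carries over to Postgres or ClickHouse, where frame choice and indexing on the partition/order columns drive query cost.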
5. Collaboration & Leadership
- Work closely with product, analytics, and engineering teams to gather requirements and translate them into well-designed data solutions.
- Take ownership of projects from concept to deployment with minimal supervision.
- Mentor team members and contribute to building best practices, coding standards, and documentation.
Required Skills & Experience:
- 6+ years in Data Engineering, with at least 2 years in a senior/lead role.
- Strong expertise in building production-grade data pipelines end-to-end.
- Proven experience with both batch and real-time systems.
- Hands-on experience with Kafka/Redpanda, CDC solutions, and streaming frameworks.
- Deep understanding of data lakes, warehouses, and modern storage formats (Parquet).
- Strong SQL expertise, including complex window functions and performance optimization.
- Experience working with multiple databases: Postgres, MySQL, MongoDB, ClickHouse, Timescale.
- Proficiency with orchestration tools (Airflow/Dagster/Prefect).
- Capable of delivering scalable data architectures in fast-paced, high-growth environments.
- Experience with monitoring, alerting, and ensuring data reliability at scale.
Nice-to-Have:
- Logistics/transportation domain experience.
- Knowledge of geospatial data processing.
- Familiarity with DBT, Lakehouse technologies (Iceberg/Delta/Hudi), and Kubernetes.
- Experience with cost optimization on cloud platforms.
Functional Area: Data Engineering
Job Code: 1577877