Posted on: 05/12/2025
Description :
Responsibilities :
- Building PB-Scale Data Pipelines : Build highly performant data infrastructure that scales to 100K+ jobs and handles petabytes of data per day.
- Cloud-Native Data Infrastructure : Design and implement a robust, scalable data infrastructure on AWS, utilising Kubernetes and Airflow for efficient resource management and deployment.
- Intelligent SQL Ecosystem : Design and develop a comprehensive SQL intelligence system encompassing query optimisation, dynamic pipeline generation, and data lineage tracking. Leverage your expertise in SQL query profiling, AST analysis, and parsing to build an engine that improves query performance, generates adaptive data pipelines, and implements granular column-level lineage.
Impact :
- Innovation at the Forefront : Push the boundaries of data engineering by combining traditional techniques with cutting-edge AI technologies.
- High Visibility : Directly affect the productivity and capabilities of global data teams; your contributions will be crucial to the daily operations of thousands of users across hundreds of countries.
- Open Source Contribution : As part of our commitment to the developer community, you will contribute to our open-source initiatives, gaining recognition in the tech community.
- Career Growth : This role is a launchpad into the rapidly advancing field of AI-powered data engineering, offering exposure to state-of-the-art technologies and generative AI applications.
Requirements :
- 8+ years of experience in data engineering, with a focus on building scalable data pipelines and systems.
- Strong proficiency in Python and SQL.
- Extensive experience with SQL query profiling, optimisation, and performance tuning, preferably with Snowflake.
- Deep understanding of the SQL Abstract Syntax Tree (AST) and experience working with SQL parsers (e.g., sqlglot) for generating column-level lineage and dynamic ETLs (an illustrative sketch follows this list).
- Experience building data pipelines with Airflow or dbt (see the DAG sketch after this list).
- [Optional] Solid understanding of cloud platforms, particularly AWS.
- [Optional] Familiarity with Kubernetes (K8S) for containerised deployments.
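Illustrative only: a minimal sketch of the kind of AST-driven, column-level lineage work described above, using sqlglot. The query, table names, and schema here are invented for the example and are not part of any existing system.

```python
# Minimal, illustrative sketch (not this team's implementation): tracing
# column-level lineage with sqlglot. The query, tables, and schema are invented.
import sqlglot
from sqlglot.lineage import lineage

sql = """
SELECT o.order_id, c.name AS customer_name, o.amount * 1.1 AS gross_amount
FROM orders AS o
JOIN customers AS c ON o.customer_id = c.id
"""

# Hypothetical schema; in practice this would come from the warehouse catalog.
schema = {
    "orders": {"order_id": "int", "customer_id": "int", "amount": "double"},
    "customers": {"id": "int", "name": "varchar"},
}

# Trace where the derived column `gross_amount` ultimately comes from.
node = lineage("gross_amount", sql, schema=schema)
for n in node.walk():
    print(n.name)  # prints each node in the lineage chain for gross_amount

# Plain AST inspection: list every column referenced in the query.
expression = sqlglot.parse_one(sql)
print(sorted({col.sql() for col in expression.find_all(sqlglot.exp.Column)}))
```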
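Illustrative only: a minimal Airflow TaskFlow sketch of a simple extract/transform/load pipeline. It assumes Airflow 2.4+ (for the `schedule` argument); all task names, data, and logic are placeholders rather than an actual pipeline from this role.

```python
# Minimal, illustrative TaskFlow sketch (Airflow 2.4+); tasks and data are placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
def example_sql_pipeline():
    @task
    def extract() -> list:
        # Placeholder: in practice, read rows or table metadata from the warehouse.
        return [{"order_id": 1, "amount": 100.0}]

    @task
    def transform(rows: list) -> list:
        # Placeholder: apply a derived column, mirroring a generated SQL transform.
        return [{**row, "gross_amount": row["amount"] * 1.1} for row in rows]

    @task
    def load(rows: list) -> None:
        # Placeholder: write results back to the warehouse.
        print(f"loaded {len(rows)} rows")

    load(transform(extract()))


example_sql_pipeline()
```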
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1585695