
Staff Data Engineer - Python/SQL

Multiple Locations
9 - 13 Years

Posted on: 05/12/2025

Job Description

Responsibilities :


- PB-Scale Data Pipelines : Build highly performant, large-scale data infrastructure that scales to 100K+ jobs and petabytes of data per day.


- Cloud-Native Data Infrastructure : Design and implement a robust, scalable data infrastructure on AWS, utilising Kubernetes and Airflow for efficient resource management and deployment.


- Intelligent SQL Ecosystem : Design and develop a comprehensive SQL intelligence system spanning query optimisation, dynamic pipeline generation, and data lineage tracking. Leverage expertise in SQL query profiling, AST analysis, and parsing to build an engine that improves query performance, generates adaptive data pipelines, and produces granular column-level lineage (a minimal AST-parsing sketch follows below).
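As a rough illustration of the AST work this last bullet describes (a sketch under assumptions, not this team's actual engine), the snippet below uses the open-source sqlglot parser, named later in the Requirements, to pull the column and table references out of a hypothetical query; those references are the raw material for column-level lineage. sqlglot also ships a dedicated lineage module; the manual walk here just shows the mechanics.

    import sqlglot
    from sqlglot import exp

    # Hypothetical query; any SELECT works here.
    sql = """
    SELECT o.customer_id, SUM(o.amount) AS total_spend
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY o.customer_id
    """

    # Parse the SQL text into an abstract syntax tree.
    tree = sqlglot.parse_one(sql)

    # Every column reference with its table qualifier -- the starting
    # point for column-level lineage.
    for column in tree.find_all(exp.Column):
        print(column.table, column.name)

    # Every table the query reads from.
    for table in tree.find_all(exp.Table):
        print(table.name)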


Impact :


- Innovation at the Forefront : Push the boundaries of data engineering by combining traditional techniques with cutting-edge AI technologies.


- High Visibility : Directly affect the productivity and capabilities of global data teams; your contributions will be crucial to the daily operations of thousands of users across hundreds of countries.


- Open Source Contribution : As part of our commitment to the developer community, you will contribute to our open-source initiatives, gaining recognition in the tech community.


- Career Growth : This role is a launchpad into the rapidly advancing field of AI-powered data engineering, offering exposure to state-of-the-art technologies and generative AI applications.


Requirements :


- 8+ years of experience in data engineering, with a focus on building scalable data pipelines and systems.


- Strong proficiency in Python and SQL.


- Extensive experience with SQL query profiling, optimisation, and performance tuning, preferably with Snowflake (see the profiling sketch after this list).


- Deep understanding of SQL Abstract Syntax Trees (AST) and experience working with SQL parsers (e.g., sqlglot) for generating column-level lineage and dynamic ETLs.


- Experience in building data pipelines using Airflow or dbt (a minimal Airflow sketch follows this list).


- [Optional] Solid understanding of cloud platforms, particularly AWS.


- [Optional] Familiarity with Kubernetes (K8S) for containerised deployments.
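To make the Snowflake profiling requirement concrete, here is a minimal sketch, assuming the snowflake-connector-python package and access to Snowflake's built-in ACCOUNT_USAGE.QUERY_HISTORY view; the connection parameters are hypothetical placeholders.

    import snowflake.connector

    # Hypothetical credentials -- substitute your own.
    conn = snowflake.connector.connect(
        account="my_account",
        user="my_user",
        password="...",
    )

    # Surface the slowest queries of the last day as tuning candidates.
    SLOW_QUERIES = """
    SELECT query_id, query_text, total_elapsed_time, bytes_scanned
    FROM snowflake.account_usage.query_history
    WHERE start_time > DATEADD('day', -1, CURRENT_TIMESTAMP())
    ORDER BY total_elapsed_time DESC
    LIMIT 10
    """

    for row in conn.cursor().execute(SLOW_QUERIES):
        print(row)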
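And for the Airflow requirement, a minimal DAG sketch, assuming Airflow 2.x; the dag_id and task bodies are placeholders, not this team's pipelines.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Placeholder: pull raw data from a source system.
        print("extracting")

    def transform():
        # Placeholder: clean and reshape the extracted data.
        print("transforming")

    with DAG(
        dag_id="example_etl",               # hypothetical name
        start_date=datetime(2025, 1, 1),
        schedule="@daily",                  # "schedule_interval" on Airflow < 2.4
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        extract_task >> transform_task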

