Description:
Responsibilities:
- Building PB-Scale Data Pipelines: Build highly performant, large-scale data infrastructure that scales to 100K+ jobs and processes PB-scale data per day.
- Cloud-Native Data Infrastructure: Design and implement robust, scalable data infrastructure on AWS, utilizing Kubernetes and Airflow for efficient resource management and deployment.
- Intelligent SQL Ecosystem: Design and develop a comprehensive SQL intelligence system encompassing query optimization, dynamic pipeline generation, and data lineage tracking. Leverage your expertise in SQL query profiling, AST analysis, and parsing to build a sophisticated engine that improves query performance, generates adaptive data pipelines, and derives granular column-level lineage (see the sketch after this list).
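For context, here is a minimal sketch of the kind of AST analysis and column-level lineage work described above, using sqlglot. The sample query, the Snowflake dialect choice, and all table and column names are illustrative assumptions, not details of this role.

```python
import sqlglot
from sqlglot import exp
from sqlglot.lineage import lineage

sql = """
SELECT o.order_id, c.name AS customer_name
FROM orders AS o
JOIN customers AS c ON o.customer_id = c.id
"""

# Parse the query into an AST, then walk it for referenced tables and columns.
ast = sqlglot.parse_one(sql, dialect="snowflake")
tables = sorted({t.name for t in ast.find_all(exp.Table)})
columns = sorted({c.sql() for c in ast.find_all(exp.Column)})
print(tables)   # ['customers', 'orders']
print(columns)  # ['c.id', 'c.name', 'o.customer_id', 'o.order_id']

# Trace one output column back to its source columns.
for node in lineage("customer_name", sql, dialect="snowflake").walk():
    print(node.name)
```

The same parse-and-walk pattern underpins query rewriting and dynamic pipeline generation: once the query is an AST, transformations and lineage extraction become tree operations rather than string manipulation.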
Requirements:
- 8+ years of experience in data engineering, with a focus on building scalable data pipelines and systems.
- Strong proficiency in Python and SQL.
- Extensive experience with SQL query profiling, optimization, and performance tuning, preferably with Snowflake.
- Deep understanding of SQL Abstract Syntax Trees (ASTs) and experience working with SQL parsers (e.g., sqlglot) to generate column-level lineage and dynamic ETLs.
- Experience building data pipelines with Airflow or dbt (see the minimal DAG sketch after this list).
- [Optional] Solid understanding of cloud platforms, particularly AWS.
- [Optional] Familiarity with Kubernetes (K8s) for containerized deployments.
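For orientation, a minimal Airflow DAG sketch of the pipeline style referenced above; the DAG id, schedule, and task body are placeholder assumptions, not project specifics.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_and_load() -> None:
    # Stand-in for real extract/transform/load logic.
    print("extracting and loading a batch")


with DAG(
    dag_id="example_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    PythonOperator(task_id="extract_and_load", python_callable=extract_and_load)
```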