Lead Data Engineer - Python/Spark

Worksconsultancy
6 - 8 Years
Bangalore

Posted on: 24/03/2026

Job Description

As a Lead Data Engineer, you will define and drive the enterprise data engineering strategy for Nike's next-generation unified analytics foundation spanning Digital, Stores, and Marketplace channels.

This role owns the end-to-end data architecture roadmap, including the complete divestiture of Snowflake and a successful transition to a Databricks/Spark Lakehouse ecosystem on AWS, while ensuring 95% KPI alignment and metric consistency across the enterprise.

You will operate as both a hands-on technical leader and a strategic architect, influencing platform design decisions, governance models, and modernization programs at global scale.

Key Responsibilities :

Architecture & Technical Leadership :


- Define the target-state data architecture for Nike's unified analytics platform using Databricks, Spark, and AWS-native services.

- Own and execute the Snowflake divestiture strategy, ensuring zero residual footprint and seamless continuity of business reporting.

- Lead the design of highly scalable, secure, and cost-efficient data pipelines across batch and streaming workloads.

- Establish architectural standards for data modeling, storage formats, and performance optimization.

Data Engineering & Platform Strategy :


- Design and implement ETL/ELT pipelines using Python, Spark, and SQL, enabling large-scale data transformation and advanced analytics.

- Build pipelines leveraging AWS S3, Lambda, EMR, and Databricks, optimized for reliability and performance.

- Enable real-time and near-real-time data processing using Kafka, Kinesis, and Spark Streaming.

- Drive containerized deployment strategies using Docker and Kubernetes.

Orchestration, CI/CD & Infrastructure :


- Lead global orchestration standards using Apache Airflow for complex, cross-domain workflows.

- Implement CI/CD pipelines using Git and Jenkins, and enforce best practices for quality, security, and automation.

- Own infrastructure provisioning through Infrastructure as Code (Terraform / CloudFormation).

Data Governance & Enterprise Metrics :


- Establish and govern enterprise-wide data lineage, cataloging, and access control using Unity Catalog and metadata-driven designs.

- Define and manage metric dictionaries and KPI frameworks, ensuring semantic consistency across domains.

- Partner with analytics, product, and business teams to drive ≥95% KPI alignment and trusted insights.

Observability & Operational Excellence :


- Implement robust monitoring, alerting, and observability across pipelines and platforms.

- Define SLAs, SLOs, and operational playbooks to support mission-critical analytics workloads.

- Mentor and technically guide senior and mid-level engineers, raising the overall engineering bar.

Must-Have Qualifications :


- 6 to 8+ years of experience in data engineering, distributed systems, and platform architecture with clear technical ownership.

- Deep AWS expertise, including S3, Lambda, and EMR, along with Databricks in large-scale production environments.

- Advanced Python for data processing, automation, testing, and optimization.

- Advanced SQL expertise for complex querying, windowing functions, data modeling, and performance tuning.

- Demonstrated success in modernizing legacy platforms and migrating complex analytics logic to Databricks/Spark Lakehouse architectures.

- Strong experience with data governance, lineage, cataloging, and enterprise metric management.

