Posted on: 16/07/2025
Job Description:
We are seeking a highly skilled Manager Data Engineer with deep expertise in AWS data services, data wrangling using Python & PySpark, and a solid understanding of data governance, lineage, and quality frameworks. The ideal candidate will have a proven track record of delivering end-to-end data pipelines for logistics, supply chain, enterprise finance, or B2B analytics use cases.
Key Responsibilities:
- Design and implement scalable ETL/ELT pipelines using AWS Glue (3.0+), PySpark, and Athena.
- Build and manage data lakes on S3 using the bronze/silver/gold zone structure.
- Ensure pipelines are audit-ready, with validation logs, schema metadata, and classification tagging using Glue Data Catalog.
- Own all end-to-end pipeline stages: ingestion, transformation, validation, metadata enrichment, and BI readiness.
- Implement data quality frameworks using tools like Great Expectations to catch nulls, outliers, and rule violations early.
- Maintain data lineage and governance using OpenMetadata or Amundsen.
- Collaborate with Data Scientists for ML pipelines, feature engineering, and I/O (JSON/Parquet) optimization.
- Prepare filterable, flattened datasets for BI tools like Sigma, Power BI, or Tableau.
- Interpret complex business metrics (e.g., forecasted revenue, margin, utilization) and translate them into technical logic.
- Build orchestration workflows using AWS Step Functions, EventBridge, and CloudWatch.
- Ensure delivery aligns with evolving business KPIs and compliance standards.
Required Skills:
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- 6 to 9 years of hands-on experience in data engineering.
- Minimum 3 years working with AWS-native data platforms.
- Strong expertise in AWS ecosystem: Glue, Athena, S3, Step Functions, CloudWatch, EventBridge.
- Programming proficiency in Python 3.x, PySpark, and SQL (Athena/Presto).
- Experience with Pandas, NumPy, and time series manipulation.
- Proven track record of implementing data quality, governance, and lineage (Great Expectations, OpenMetadata, PII tagging).
- Experience in building audit logs, metadata tagging, and schema management.
- Ability to translate business metrics into reliable technical pipelines.
- Strong communication and collaboration with data, QA, and business teams.
- Familiarity with feature engineering, KPI logic, and BI-ready data structures.
Preferred Skills:
- Experience in domains such as logistics, supply chain, enterprise finance, or B2B analytics.
- Exposure to ML pipelines, data modeling, and KPI interpretation.
- Knowledge of Parquet/JSON, Agile development, and BI dashboards.
Success in This Role Means:
- You ship production-ready pipelines with embedded validation and lineage.
- You minimize QA rework by proactively handling edge cases and keeping pipeline logic clear.
- You become a go-to expert for data accuracy and business logic interpretation.
- You deliver scalable solutions that are easy to understand for BI, QA, and architecture teams.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1513323