
Job Description

Job Overview:

We are seeking an experienced Data Engineer with strong hands-on expertise in AWS data services, PySpark, Databricks, and orchestration tools. The ideal candidate will build, optimize, and maintain scalable ETL/ELT pipelines that support our data analytics, reporting, and business intelligence needs. The role requires strong problem-solving skills, a solid understanding of cloud-based data architectures, and the ability to collaborate with cross-functional teams.


Must-Have Skills:

- AWS S3
- AWS Glue
- AWS Lambda
- Databricks
- Apache Airflow
- PySpark
- SQL
- Amazon Redshift
- AWS Step Functions


Key Responsibilities:

- Design, develop, and maintain end-to-end ETL/ELT pipelines using AWS Glue, PySpark, Databricks, and other AWS-native services (a minimal PySpark sketch of this pattern follows this list).
- Work with structured and semi-structured data stored in AWS S3 and integrate with Amazon Redshift to support scalable, efficient data warehousing.
- Automate data workflows using AWS Lambda, Apache Airflow, AWS Step Functions, and related orchestration tools to ensure seamless data movement and processing (see the Airflow sketch after this list).
- Develop and optimize complex SQL queries for data extraction, transformation, reporting, and analytics.
- Monitor and troubleshoot data pipelines, ensuring performance, reliability, and adherence to SLAs.
- Implement data quality checks, validation frameworks, and performance tuning for large-scale data processing workloads.
- Collaborate with data analysts, data scientists, product managers, and business stakeholders to gather requirements and deliver data solutions aligned with business goals.
- Follow data engineering best practices, including automation, documentation, version control, and CI/CD for data workflows.
- Stay current with emerging cloud and big data technologies and recommend improvements to the existing data ecosystem.
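
To illustrate the pipeline work described in the first responsibility, here is a minimal PySpark sketch of the S3-to-curated-zone pattern; the bucket names, paths, and columns are hypothetical placeholders, not details of this role's actual pipelines:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Read semi-structured JSON events from a raw zone in S3
# (bucket and prefix are illustrative assumptions).
raw = spark.read.json("s3://example-raw-bucket/orders/")

# Basic cleansing and typing: drop rows missing keys, normalize the
# timestamp, derive a partition column, and fix the amount's precision.
cleaned = (
    raw.dropna(subset=["order_id", "order_ts"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(12,2)"))
)

# Write columnar Parquet to a curated zone; Redshift can then ingest it
# with a COPY command, or query it in place via Redshift Spectrum.
(cleaned.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-curated-bucket/orders/"))
```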
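
And for the orchestration responsibility, a sketch of a daily Airflow DAG; the task bodies are stubs and the DAG and task names are assumptions (on Airflow versions before 2.4, `schedule` would be `schedule_interval`):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_glue_job(**context):
    # A real task would start a Glue job (e.g., via boto3's
    # glue.start_job_run) and wait for it to finish; stubbed here.
    print("starting Glue job")


def validate_output(**context):
    # Placeholder data-quality check (row counts, null rates, etc.).
    print("validating curated data")


with DAG(
    dag_id="orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_transform = PythonOperator(
        task_id="run_glue_job",
        python_callable=run_glue_job,
    )
    quality_check = PythonOperator(
        task_id="validate_output",
        python_callable=validate_output,
    )

    # The quality check runs only after the transform succeeds.
    extract_transform >> quality_check
```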


Preferred Qualifications (Nice to Have):

- Experience with AWS CloudFormation, Terraform, or other infrastructure-as-code (IaC) tools
- Knowledge of data governance, cataloging, and security best practices
- Exposure to streaming technologies (e.g., Kafka, Kinesis)
- Experience with Redshift Spectrum or AWS Lake Formation


Educational Background:

- Bachelor's or Master's degree in Computer Science, Information Technology, Data Engineering, or a related field
