Posted on: 25/08/2025
Role : Data Engineer
Location : Remote
Experience : 5 to 9 years
Key Responsibilities :
- Design, develop, and maintain scalable ETL/ELT data pipelines using PySpark, SQL, and Python (a brief pipeline sketch follows this list).
- Work with AWS data services (Glue, Redshift, S3, Athena, EMR, Lambda, etc.) to manage large-scale data processing.
- Implement data ingestion, transformation, and integration from multiple structured/unstructured data sources.
- Optimize query performance and ensure data quality, consistency, and reliability across pipelines.
- Collaborate with Data Scientists, Analysts, and Business stakeholders to enable data-driven decision-making.
- Implement best practices for data governance, security, and compliance.
- Monitor, troubleshoot, and improve pipeline performance in production.
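To make the pipeline expectations concrete, here is a minimal PySpark ETL sketch; the bucket paths, column names, and app name are hypothetical placeholders, not part of the role description:

```python
# Minimal ETL sketch: extract raw JSON from S3, clean it, and load
# partitioned Parquet back to S3. All paths and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw JSON events from S3 (placeholder path).
raw = spark.read.json("s3://example-bucket/raw/orders/")

# Transform: deduplicate, drop null amounts, derive a partition column.
orders = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("amount").isNotNull())
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write date-partitioned Parquet, queryable via Athena or
# Redshift Spectrum once a Glue catalog table points at the prefix.
(orders.write
       .mode("overwrite")
       .partitionBy("order_date")
       .parquet("s3://example-bucket/curated/orders/"))
```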
Required Skills & Qualifications :
- 5+ years of professional experience in data engineering or big data technologies.
- Strong programming skills in Python and expertise in PySpark (RDD/DataFrame/SQL APIs).
- Strong experience in SQL (query optimization, joins, window functions, stored procedures); a window-function sketch follows this list.
- Hands-on experience with AWS data services (S3, Glue, Athena, Redshift, EMR, Lambda, IAM).
- Solid understanding of data modeling, ETL/ELT concepts, and data warehouse design.
- Experience with version control (Git), CI/CD pipelines, and Agile methodologies.
- Strong problem-solving and debugging skills.
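For the window-function requirement, a small PySpark equivalent is sketched below; the `orders` DataFrame and its columns are invented for illustration:

```python
# Running total per customer: the DataFrame-API equivalent of
#   SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date)
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("window_demo").getOrCreate()

# Hypothetical sample data for illustration only.
orders = spark.createDataFrame(
    [("c1", "2025-01-01", 10.0),
     ("c1", "2025-01-05", 20.0),
     ("c2", "2025-01-02", 15.0)],
    ["customer_id", "order_date", "amount"],
)

w = Window.partitionBy("customer_id").orderBy("order_date")
orders.withColumn("running_total", F.sum("amount").over(w)).show()
```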
Posted in : Data Engineering
Functional Area : Data Engineering
Job Code : 1534761