Posted on: 06/12/2025
Role Overview:
We are seeking a highly skilled Data Engineer with deep hands-on expertise in SQL and PySpark, strong fundamentals in data architecture, and proven experience in building scalable ETL pipelines.
You will be responsible for designing, developing, optimizing, and managing data workflows that ensure high-quality, reliable, and accessible data for analytics and reporting.
Key Responsibilities:
- Design, implement, and maintain scalable, production-grade ETL pipelines using PySpark and SQL.
- Develop robust data integration processes ensuring data accuracy, consistency, and timeliness.
- Perform advanced SQL queries, data profiling, and root-cause analysis for data quality issues.
- Optimize data pipelines for performance, cost efficiency, and scalability in distributed systems.
- Implement strong data security, governance, and access control measures per regulatory standards.
- Monitor and troubleshoot pipeline issues with a focus on quick resolution and minimal downtime.
- Work closely with DevOps to deploy and manage data engineering workloads in a cloud environment.
- Prepare and maintain comprehensive documentation, including data flow diagrams and technical specs.
- Stay updated with evolving tools, frameworks, and best practices in the data engineering ecosystem.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- Strong hands-on programming experience in Python (PySpark) and advanced SQL.
- Proven experience designing and developing end-to-end data pipelines in real-world environments.
- Strong background in ETL frameworks (e.g., Apache Spark, Flink, Airflow).
- Experience with cloud platforms: AWS / Azure / GCP.
- Good understanding of data warehousing, dimensional modeling, and distributed computing.
- Ability to work with large-scale datasets and optimize data transformations.
- Strong analytical, debugging, and problem-solving skills.
- Excellent communication and collaboration abilities.
- Ability to thrive in an agile, fast-paced, delivery-focused environment.
Posted in: Data Engineering
Functional Area: Big Data / Data Warehousing / ETL
Job Code: 1585733