Triunitysoft - Data Engineer - Python/ETL

TSI Triunity
Pune
6-8 Years

Posted on: 16/09/2025

Job Description

About the Role:

We are seeking a highly skilled Data Engineer to design, build, and manage robust data pipelines and frameworks on Google Cloud Platform (GCP).

The ideal candidate will have hands-on experience in PySpark, Python, GCP services (BigQuery, Cloud Functions, Pub/Sub), and Terraform, with strong capabilities in pipeline development, monitoring, and documentation (HLD & LLD).

Key Responsibilities:

Data Pipeline Development:

- Design, build, and optimize scalable ETL/ELT data pipelines using PySpark and Python (a minimal sketch follows this list).

- Implement GCP-native solutions leveraging BigQuery, Cloud Functions, Pub/Sub, and related services (see the Cloud Function sketch below).

- Use Terraform to automate infrastructure provisioning and deployments.
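
By way of illustration, here is a minimal PySpark ETL sketch of the kind of pipeline this role involves: it reads raw CSV from Cloud Storage, aggregates, and loads the result into BigQuery. All bucket, dataset, table, and column names are hypothetical placeholders, and the job assumes the spark-bigquery connector is available on the cluster (as it is on Dataproc).

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders-etl").getOrCreate()

    # Extract: read raw CSV events from a hypothetical Cloud Storage path.
    raw = spark.read.option("header", True).csv("gs://example-bucket/raw/orders/")

    # Transform: drop malformed rows, then aggregate revenue per day.
    daily = (
        raw.withColumn("amount", F.col("amount").cast("double"))
           .where(F.col("amount").isNotNull())
           .groupBy(F.to_date("created_at").alias("order_date"))
           .agg(F.sum("amount").alias("revenue"), F.count("*").alias("orders"))
    )

    # Load: write the aggregate to BigQuery via the spark-bigquery connector
    # (the temporary bucket is required for the connector's indirect writes).
    (daily.write.format("bigquery")
          .option("table", "example_project.analytics.daily_revenue")
          .option("temporaryGcsBucket", "example-staging-bucket")
          .mode("overwrite")
          .save())

Likewise, a minimal sketch of a Pub/Sub-triggered Cloud Function (2nd-gen Python runtime, using the functions-framework library); the function name and payload handling are hypothetical, and a real handler would route the event onward rather than just log it.

    import base64
    import logging

    import functions_framework

    @functions_framework.cloud_event
    def handle_order_event(cloud_event):
        # Pub/Sub delivers the payload base64-encoded inside the CloudEvent body.
        message = base64.b64decode(cloud_event.data["message"]["data"]).decode("utf-8")
        logging.info("Received order event: %s", message)

Such a function is typically deployed with gcloud functions deploy --gen2 --trigger-topic=<topic>, where the topic name is a placeholder.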

Pipeline Monitoring & Reliability:

- Implement monitoring, logging, and alerting mechanisms to ensure pipeline reliability and data quality (a sample check follows this list).

- Troubleshoot pipeline issues and optimize performance.
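
As an illustration, a minimal post-load data-quality check: it compares today's row count in a BigQuery table against a volume floor and emits a structured log line that a Cloud Logging log-based alert could match on. The table name and threshold are hypothetical.

    import logging

    from google.cloud import bigquery

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("pipeline-monitor")

    MIN_EXPECTED_ROWS = 1000  # hypothetical volume floor for today's load

    client = bigquery.Client()
    query = """
        SELECT COUNT(*) AS n
        FROM `example_project.analytics.daily_revenue`
        WHERE order_date = CURRENT_DATE()
    """
    n = list(client.query(query).result())[0]["n"]

    if n < MIN_EXPECTED_ROWS:
        # A Cloud Logging log-based alert can match on this message prefix.
        log.error("DATA_QUALITY_ALERT daily_revenue rows=%d below floor=%d",
                  n, MIN_EXPECTED_ROWS)
    else:
        log.info("daily_revenue row count OK: %d", n)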

Architecture & Documentation:

- Contribute to High-Level Design (HLD) and Low-Level Design (LLD) documents for data solutions.

- Collaborate with architects, data scientists, and business teams to translate requirements into technical specifications.

Collaboration & Best Practices:

- Work with cross-functional teams to integrate pipelines into broader data platforms.

- Follow best practices for code quality, version control, CI/CD, and security.

Required Skills & Experience:

- Strong proficiency in PySpark and Python for data processing.

- Hands-on experience with GCP services: BigQuery, Cloud Functions, Pub/Sub.

- Infrastructure-as-Code expertise with Terraform.

- Experience in building, deploying, and monitoring large-scale data pipelines.

- Knowledge of data architecture and ability to prepare HLD and LLD documentation.

- Strong problem-solving skills and ability to work in agile environments.

Preferred Qualifications:

- Experience with technologies such as Hadoop, Hive, Kafka, Snowflake, Matillion, and AWS.

- Knowledge of CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, etc.).

- Familiarity with data governance, lineage, and security frameworks.

- Experience with containerization (Docker, Kubernetes) is a plus.

