hirist

Data Engineer - Python Programming

Unicorn workforce
Bangalore
6 - 8 Years

Posted on: 28/07/2025

Job Description

Key Responsibilities and Skills:

GCP Data Services:

- Experience with various GCP data services such as BigQuery, Dataflow, Dataproc, Cloud Storage, Pub/Sub, and Data Fusion.

PySpark Development:

- Designing, developing, and optimizing PySpark applications for large-scale data processing, including ETL (Extract, Transform, Load) operations, data transformations, and aggregations on distributed datasets.

Data Pipeline Design:

- Building robust and scalable data pipelines for data ingestion, processing, and delivery, often incorporating both real-time and batch processing requirements.
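
The ingestion/processing/delivery split described above can be sketched, purely for illustration and with no GCP dependencies, as a small generator-based batch pipeline. All function names, the record schema, and the sample data below are invented for the example:

```python
import json

# Hypothetical three-stage batch pipeline: ingest -> transform -> deliver.
# The stage boundaries mirror the ingestion/processing/delivery split
# described in the posting; everything concrete here is illustrative.

def ingest(raw_lines):
    """Parse newline-delimited JSON records, skipping malformed rows."""
    for line in raw_lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # a real pipeline would route this to a dead-letter sink

def transform(records):
    """Keep completed orders and normalize the amount to cents."""
    for rec in records:
        if rec.get("status") == "completed":
            yield {"order_id": rec["order_id"],
                   "amount_cents": int(round(rec["amount"] * 100))}

def deliver(records):
    """Collect results; a real sink would write to BigQuery or Cloud Storage."""
    return list(records)

raw = [
    '{"order_id": 1, "status": "completed", "amount": 12.5}',
    'not json',
    '{"order_id": 2, "status": "pending", "amount": 3.0}',
]
result = deliver(transform(ingest(raw)))
```

Chaining generators keeps each stage independently testable, which is the same property a production pipeline gets from separating its ingestion, processing, and delivery steps.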

Performance Optimization:

- Tuning PySpark jobs and GCP data services for optimal performance, cost-efficiency, and resource utilization.

Data Modeling and Architecture:

- Understanding data modeling principles, designing efficient data schemas, and contributing to overall data warehouse and data lake architectures on GCP.

Python Proficiency:

- Strong programming skills in Python for scripting, automation, data manipulation (using libraries like Pandas), and integrating with GCP services.

SQL Expertise:

- Proficiency in SQL for querying data in BigQuery and other data sources, as well as for data validation and analysis.
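
The data-validation side of this bullet can be sketched with standard SQL. Below, Python's built-in sqlite3 stands in for BigQuery purely so the example is self-contained; the `orders` table, its rows, and both checks are invented for illustration:

```python
import sqlite3

# sqlite3 is used here only as a self-contained stand-in for BigQuery;
# the table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 12.5), (2, None), (2, 3.0)])

# Two typical validation queries: null required fields and duplicate keys.
null_amounts = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE amount IS NULL").fetchone()[0]
dup_keys = conn.execute(
    "SELECT COUNT(*) FROM (SELECT order_id FROM orders "
    "GROUP BY order_id HAVING COUNT(*) > 1)").fetchone()[0]
```

The same `COUNT`/`GROUP BY`/`HAVING` pattern carries over to BigQuery's Standard SQL dialect largely unchanged.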

Troubleshooting and Monitoring:

- Identifying and resolving issues in data pipelines, monitoring data quality and pipeline health, and implementing error handling mechanisms.
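
One common shape such an error-handling mechanism takes is retry-with-backoff around a pipeline step, with failures logged for monitoring. The sketch below uses only the standard library; `run_with_retries`, `flaky_step`, and the retry parameters are all hypothetical names for the example:

```python
import logging
import time

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("pipeline")

def run_with_retries(step, retries=3, base_delay=0.01):
    """Run a zero-argument pipeline step, retrying transient failures.

    Delays grow exponentially between attempts; the final failure is
    re-raised so the orchestrator can mark the run as failed.
    """
    for attempt in range(1, retries + 1):
        try:
            return step()
        except Exception as exc:
            log.warning("attempt %d/%d failed: %s", attempt, retries, exc)
            if attempt == retries:
                raise  # surface the error after exhausting retries
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulate a step that fails twice before succeeding.
calls = {"n": 0}

def flaky_step():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retries(flaky_step)
```

Logging each failed attempt (rather than only the final one) is what makes the retries visible to pipeline-health dashboards and alerting.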

Collaboration and Communication:

- Working effectively with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions.

Version Control and DevOps:

- Familiarity with version control systems (e.g., Git) and applying DevOps practices for continuous integration and deployment of data solutions.

