hirist

Data Engineer - Google Cloud Platform

infinitrix Consulting
Mumbai
4 - 6 Years

Posted on: 18/07/2025

Job Description

Job Summary:


We are seeking a highly skilled and motivated Data Engineer (GCP) to join our growing data and cloud engineering team.

You will play a critical role in building and managing large-scale data infrastructure and processing pipelines on the Google Cloud Platform.

Your work will directly contribute to business intelligence, analytics, and machine learning initiatives by ensuring that reliable, high-quality data is available and accessible across the organization.


Key Responsibilities:


- Design, build, and maintain scalable, fault-tolerant data pipelines on Google Cloud Platform using tools like DataProc, DataFlow, and Cloud Composer.

- Implement distributed data processing using Apache Spark (PySpark) on DataProc and stream/batch processing using Apache Beam in DataFlow.

- Perform ETL (Extract, Transform, Load) operations to prepare data for downstream analytics and reporting.

- Work closely with data scientists, analysts, and business users to define data requirements and deliver actionable insights through BigQuery.

- Build and maintain data models, data marts, and automated pipelines for structured and unstructured data.

- Monitor and troubleshoot performance issues in real time; proactively tune pipelines and workflows for cost and speed efficiency.

- Implement data governance standards around data quality, privacy, security, and compliance (GDPR, HIPAA, etc.).

- Write and maintain technical documentation for data flows, architecture, ETL jobs, and data transformations.

- Participate in code reviews and contribute to a culture of continuous improvement and high code quality.

- Collaborate in agile development environments using tools such as Git, JIRA, and CI/CD frameworks.
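The extract-transform-load pattern behind several of the responsibilities above can be sketched in plain Python. This is a minimal, hypothetical illustration of the pipeline shape; a production version would run as PySpark on DataProc or Apache Beam on DataFlow, with BigQuery as the sink rather than an in-memory list.

```python
# Minimal ETL sketch: extract -> transform -> load.
# All names (extract, transform, load, the sample fields) are illustrative,
# not an actual GCP API.

def extract(rows):
    """Extract: parse raw CSV-like strings into records."""
    return [dict(zip(("user_id", "amount"), r.split(","))) for r in rows]

def transform(records):
    """Transform: cast types and drop malformed records."""
    cleaned = []
    for rec in records:
        try:
            cleaned.append({"user_id": rec["user_id"],
                            "amount": float(rec["amount"])})
        except (KeyError, ValueError):
            continue  # skip bad rows instead of failing the whole batch
    return cleaned

def load(records, sink):
    """Load: append cleaned records to a sink (stand-in for a warehouse table)."""
    sink.extend(records)
    return len(records)

raw = ["u1,10.5", "u2,not-a-number", "u3,7.25"]
table = []
loaded = load(transform(extract(raw)), table)
```

Keeping each stage a small, pure function is what makes a pipeline "modular and reusable": the same transform can be wrapped in a Spark `map` or a Beam `ParDo` without rewriting its logic.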


Must-Have Skills & Qualifications:


- Strong hands-on experience with Google Cloud Platform (GCP) services, especially:


1. DataProc (Apache Spark/Hadoop)

2. DataFlow (Apache Beam)

3. BigQuery (SQL-based analytics and warehousing)

4. Cloud Storage, Pub/Sub, IAM, and Monitoring tools

- Proficient in Python and PySpark, with experience building modular, reusable ETL pipelines.

- Experience with real-time and batch data processing.

- Deep understanding of distributed computing, data partitioning, and pipeline orchestration.

- Solid foundation in SQL and performance optimization in BigQuery.

- Strong understanding of data security, data privacy, and industry best practices in cloud environments.
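The "data partitioning" expectation above refers to the idea underlying how Spark shuffles rows across executors and how BigQuery spreads work across slots. A minimal stdlib-only sketch of hash partitioning, with a hypothetical key and partition count chosen purely for illustration:

```python
# Hash-based partitioning sketch: records with the same key always land in
# the same partition, so per-key operations (joins, aggregations) can run
# on one worker without a further shuffle.

def partition(records, key, num_partitions):
    """Assign each record to a partition by hashing its key field."""
    parts = [[] for _ in range(num_partitions)]
    for rec in records:
        idx = hash(rec[key]) % num_partitions
        parts[idx].append(rec)
    return parts

records = [{"user_id": f"u{i % 10}", "amount": i} for i in range(100)]
parts = partition(records, "user_id", 4)
```

The key-to-partition guarantee is also why skewed keys cause hot partitions: if one `user_id` dominates the data, its partition does most of the work, which is a common target when tuning pipelines for speed.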

