Posted on: 20/02/2026
Job Description :
We are looking for an experienced GCP Data Engineer to build and optimize scalable batch and real-time data pipelines that transform complex clinical data into FHIR R4 resources (prior FHIR experience is not mandatory, but you should be comfortable working on FHIR-related projects). Your work will support real-time clinical decision-making and healthcare interoperability.
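For context, the core transformation described above can be sketched in plain Python: mapping a raw clinical record to a FHIR R4 Patient resource. In production this mapping would typically run inside a PySpark job or Kafka consumer; the input field names (`mrn`, `first_name`, etc.) are illustrative assumptions, not a real source schema.

```python
def to_fhir_patient(record: dict) -> dict:
    """Map a raw clinical record to a minimal FHIR R4 Patient resource.

    Input field names are illustrative assumptions for this sketch.
    """
    return {
        "resourceType": "Patient",
        "identifier": [{
            "system": "urn:example:mrn",  # placeholder identifier system
            "value": record["mrn"],
        }],
        "name": [{
            "family": record["last_name"],
            "given": [record["first_name"]],
        }],
        # FHIR R4 administrative gender is a fixed code set
        "gender": {"M": "male", "F": "female"}.get(record["sex"], "unknown"),
        "birthDate": record["dob"],  # FHIR R4 expects YYYY-MM-DD
    }

# One raw record as it might arrive on a Kafka topic
raw = {"mrn": "12345", "first_name": "Ada", "last_name": "Lovelace",
       "dob": "1990-04-01", "sex": "F"}
patient = to_fhir_patient(raw)
```

In a streaming pipeline this function would be applied per message (e.g. inside a PySpark `foreachBatch` or a Kafka consumer loop), with validation and data-quality checks around it.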
Mandatory Skills :
- PySpark, Kafka, Python, SQL, real-time data processing with Kafka; GCP is good to have (experience with any cloud platform is acceptable)
Key Responsibilities :
- Design and develop data transformation pipelines using PySpark and Kafka
- Optimize streaming pipelines for high-volume, low-latency processing
- Collaborate with healthcare domain experts
- Implement testing, validation, and data quality checks
- Deploy and manage pipelines using Docker & Kubernetes
- Work with GCP services such as BigQuery, Cloud Storage, and GKE (optional; experience with any cloud platform is acceptable)
Requirements :
- 4+ years of Data Engineering experience
- Strong hands-on experience with Apache PySpark
- Experience with Kafka streaming
- Strong SQL and Python
- Experience with Google Cloud Platform (experience with any cloud platform is acceptable)
- Docker and Kubernetes experience
Posted in
Data Engineering
Functional Area
Data Engineering
Job Code
1614550