Posted on: 07/10/2025
Description: Big Data Developer.
Essential Functions:
- Design and implement data pipelines for migration from HDFS/Hive to cloud object storage (e.g., S3, Ceph).
- Optimize Spark (and optionally Flink) jobs for performance and scalability in a Kubernetes environment.
- Ensure data consistency, schema evolution, and governance with Apache Iceberg or equivalent table formats.
- Support migration strategy definition by providing technical input and identifying risks.
- Mentor junior developers and review their code / design decisions.
- Collaborate with platform engineers, cloud architects, and product stakeholders to align technical implementation with project goals.
- Troubleshoot complex distributed system issues in data pipelines or storage integration.
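In practice, the migration work described above often boils down to Spark SQL DDL like the following minimal sketch. It assumes a Spark catalog named `iceberg` has been configured against the target object store (S3 or Ceph); the catalog, database, table, and column names are all illustrative.

```sql
-- Create the target Iceberg table (names are placeholders).
CREATE TABLE iceberg.analytics.events (
  event_id BIGINT,
  event_ts TIMESTAMP,
  payload  STRING
) USING iceberg
PARTITIONED BY (days(event_ts));

-- Backfill from the existing Hive table.
INSERT INTO iceberg.analytics.events
SELECT event_id, event_ts, payload FROM hive_db.events;

-- Iceberg supports in-place schema evolution...
ALTER TABLE iceberg.analytics.events ADD COLUMN source STRING;

-- ...and time travel against earlier snapshots.
SELECT * FROM iceberg.analytics.events
TIMESTAMP AS OF '2025-01-01 00:00:00';
```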
Qualifications:
- 7 to 12 years of experience.
- Strong programming skills in Scala and Python.
- Apache Spark (batch and streaming) is a must.
- Deep knowledge of HDFS internals and migration strategies.
- Experience with Apache Iceberg (or similar table formats like Delta Lake / Apache Hudi) for schema evolution, ACID transactions, and time travel.
- Running Spark and/or Flink jobs on Kubernetes (e.g., Spark-on-K8s operator, Flink-on-K8s).
- Experience with distributed object stores such as Ceph or AWS S3.
- Building ingestion, transformation, and enrichment pipelines for large-scale datasets.
- Infrastructure-as-Code (Terraform, Helm) for provisioning data infrastructure.
- Ability to work independently while guiding juniors.
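The Spark-on-K8s requirement above typically means authoring `SparkApplication` manifests for the Spark operator. A minimal sketch follows; the image, namespace, class, jar path, and resource sizes are all placeholders, not project specifics.

```yaml
# Illustrative SparkApplication for the Spark-on-K8s operator.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: hive-to-iceberg-backfill
  namespace: data-pipelines
spec:
  type: Scala
  mode: cluster
  image: registry.example.com/spark:3.5.1
  mainClass: com.example.migration.BackfillJob
  mainApplicationFile: s3a://artifacts/backfill-job.jar
  sparkVersion: "3.5.1"
  driver:
    cores: 1
    memory: 4g
    serviceAccount: spark
  executor:
    instances: 8
    cores: 4
    memory: 8g
```

Manifests like this are usually templated with Helm and provisioned alongside the rest of the data infrastructure via Terraform, which is where the Infrastructure-as-Code experience comes in.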
Would be a plus:
- Experience with Apache Flink.
- Prior experience in migration projects or large-scale data platform modernization.
- Apple experience preferred (to get up to speed on our tooling set quickly and more independently).
Posted By
Naga Bharadwaj Kunapuli
Lead Talent Acquisition at GRID DYNAMICS PRIVATE LIMITED
Last Active: 10 Nov 2025
Posted in: Data Engineering
Functional Area: Big Data / Data Warehousing / ETL
Job Code: 1556444