Grid Dynamics - Big Data Developer - Hive/Spark

Posted on: 07/10/2025

Job Description

Description : Big Data Developer.



Essential Functions :


- Design and implement data pipelines for migration from HDFS/Hive to cloud object storage (e.g., S3, Ceph).


- Optimize Spark (and optionally Flink) jobs for performance and scalability in a Kubernetes environment.


- Ensure data consistency, schema evolution, and governance with Apache Iceberg or equivalent table formats.


- Support migration strategy definition by providing technical input and identifying risks.


- Mentor junior developers and review their code / design decisions.


- Collaborate with platform engineers, cloud architects, and product stakeholders to align technical implementation with project goals.


- Troubleshoot complex distributed system issues in data pipelines or storage integration.


Qualifications :


- 7 to 12 years of experience.


- Scala and Python.
- Apache Spark (batch & streaming) is a must.


- Deep knowledge of HDFS internals and migration strategies.


- Experience with Apache Iceberg (or similar table formats like Delta Lake / Apache Hudi) for schema evolution, ACID transactions, and time travel.


- Running Spark and/or Flink jobs on Kubernetes (e.g., Spark-on-K8s operator, Flink-on-K8s).


- Experience with distributed blob storage such as Ceph, AWS S3, or similar.


- Building ingestion, transformation, and enrichment pipelines for large-scale datasets.


- Infrastructure-as-Code (Terraform, Helm) for provisioning data infrastructure.


- Ability to work independently while guiding juniors.



Would be a plus :


- Experience with Apache Flink.


- Prior experience in migration projects or large-scale data platform modernization.


- Apple experience preferred (so the candidate can get up to speed on our tooling quickly and more independently).

