Posted on: 10/08/2025
Key Responsibilities:
- Design, develop, and maintain data pipelines using Spark DataFrames and Hadoop ecosystem components.
- Develop and optimize PySpark (SparkSQL) jobs for large-scale data processing (an illustrative sketch follows this list).
- Work on cloud-based solutions using AWS / Azure / GCP.
- Contribute to migration projects within the Hadoop environment.
- Develop automation scripts in Shell and Python to improve data pipeline efficiency.
- Write and optimize SQL queries for data transformation and analytics.
- Integrate workflows using Oozie, Autosys, or other scheduling/workflow tools.
- Collaborate with cross-functional teams following Agile methodologies.
- Stay current with emerging technologies and experiment with them to improve system performance and automation.
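To make the responsibilities above more concrete, here is a minimal, illustrative PySpark sketch that combines the DataFrame API with SparkSQL; the storage paths, view name, and column names are hypothetical and not taken from this posting.

```python
# Illustrative sketch only -- paths, table, and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_events_pipeline").getOrCreate()

# Read raw events (e.g. landed on HDFS or cloud object storage) into a DataFrame.
raw = spark.read.parquet("hdfs:///data/raw/events/")  # hypothetical path

# DataFrame API: basic cleansing and enrichment.
cleaned = (
    raw.filter(F.col("event_ts").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
)

# SparkSQL: aggregate via a temporary view.
cleaned.createOrReplaceTempView("events")
daily_counts = spark.sql("""
    SELECT event_date, event_type, COUNT(*) AS event_count
    FROM events
    GROUP BY event_date, event_type
""")

# Write the result back, partitioned by date, for downstream analytics.
daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(
    "hdfs:///data/curated/daily_event_counts/"  # hypothetical path
)
```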
Must-Have Skills:
- Strong experience with Spark DataFrames and the Hadoop ecosystem.
- Hands-on experience with AWS / Azure / GCP cloud platforms.
- Proficiency in PySpark (SparkSQL).
Good-to-Have Skills:
- Knowledge of Shell scripting and Python (a small automation sketch follows this list).
- Strong SQL skills for complex queries.
- Experience with Hadoop migration projects.
- Experience with workflow/scheduling tools like Oozie or Autosys.
- Knowledge of Agile development processes.
- Passion for exploring new technologies and automation.
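As a small illustration of the Python and automation skills listed above, the following sketch checks that a pipeline's output partition has landed on HDFS before downstream jobs run; it uses the standard hdfs dfs -test -e command, while the directory layout and date convention are hypothetical assumptions.

```python
# Illustrative automation sketch only -- paths and date convention are hypothetical.
import subprocess
import sys
from datetime import date, timedelta

def partition_exists(path: str) -> bool:
    """Return True if the given HDFS path exists (uses the hdfs dfs CLI)."""
    result = subprocess.run(["hdfs", "dfs", "-test", "-e", path])
    return result.returncode == 0

if __name__ == "__main__":
    # Verify that yesterday's curated partition landed before downstream jobs run.
    run_date = (date.today() - timedelta(days=1)).isoformat()
    partition = f"/data/curated/daily_event_counts/event_date={run_date}"  # hypothetical layout
    if not partition_exists(partition):
        print(f"Missing partition: {partition}", file=sys.stderr)
        sys.exit(1)
    print(f"Partition present: {partition}")
```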
Why Join Us:
- Work on cutting-edge Big Data & Cloud projects.
- Opportunity to innovate and explore automation in large-scale systems.
- Collaborative and agile work environment with continuous learning.
Posted By: Pranjal, Senior Executive - Talent Acquisition at SUPERSOURCING TECHNOLOGIES PRIVATE LIMITED
Last Active: 13 Aug 2025
Posted in: Data Engineering
Functional Area: Big Data / Data Warehousing / ETL
Job Code: 1527250