Posted on: 10/04/2026
Job Summary:
We are seeking an experienced Python/PySpark Developer with strong expertise in big data technologies and data processing.
The ideal candidate will have hands-on experience building scalable data pipelines with PySpark, Python, and SQL, along with exposure to Java and Pandas for data manipulation.
Key Responsibilities:
- Design, develop, and maintain scalable data pipelines using Python and PySpark
- Process and analyze large datasets using Pandas and Spark DataFrames
- Write optimized queries using SQL for data extraction and transformation
- Work with Java-based components where required in the data ecosystem
- Perform data cleansing, transformation, and validation
- Optimize data workflows for performance and scalability
- Collaborate with cross-functional teams including Data Engineers and stakeholders
- Ensure data quality, integrity, and consistency
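The cleansing, transformation, and validation duties above can be sketched in miniature with Pandas (the dataset, column names, and validation rules here are hypothetical illustrations, not part of this posting):

```python
import pandas as pd

# Hypothetical raw order data with the kinds of quality issues
# the cleansing and validation duties describe.
raw = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4],
    "amount": ["10.5", "20.0", "20.0", None, "15.25"],
    "region": ["us", "EU ", "apac", "apac", None],
})

# Cleansing: drop duplicate orders, coerce types, normalize text.
clean = (
    raw.drop_duplicates(subset="order_id")
       .assign(
           amount=lambda df: pd.to_numeric(df["amount"], errors="coerce"),
           region=lambda df: df["region"].str.strip().str.upper(),
       )
)

# Validation: keep only rows that pass basic integrity checks.
valid = clean.dropna(subset=["amount", "region"])
valid = valid[valid["amount"] > 0]

print(valid)
```

In a production pipeline the same steps would typically run on Spark DataFrames for scale, with Pandas reserved for smaller, in-memory analysis.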
Required Skills:
- Strong experience in Python, PySpark, Pandas, and SQL
- 5+ years of experience in a similar role
- Good knowledge of Java (for integration or backend support)
- Hands-on experience with Apache Spark (RDD, DataFrames, Spark SQL)
- Strong understanding of ETL processes and data pipelines
- Experience with big data tools (Hadoop, Hive, etc.)
- Strong problem-solving and analytical skills
Preferred Skills:
- Experience with Airflow or other orchestration tools
- Exposure to cloud platforms (AWS / Azure / GCP)
- Knowledge of data warehousing and data lakes
- Familiarity with CI/CD and version control (Git)
Posted in: Data Analytics & BI
Functional Area: Big Data / Data Warehousing / ETL
Job Code: 1627473