Posted on: 13/08/2025
Job Summary:
We are seeking a skilled Data Engineer with strong expertise in Python and PySpark to design, develop, and optimize large-scale data pipelines. The ideal candidate will work closely with data scientists, analysts, and business stakeholders to ensure the delivery of high-quality, reliable, and scalable data solutions.
Key Responsibilities:
- Develop and optimize PySpark jobs for large-scale data processing.
- Integrate data from multiple sources into a unified data platform.
- Collaborate with cross-functional teams to understand data requirements and deliver solutions.
- Perform data cleansing, transformation, and validation to ensure accuracy and reliability.
- Monitor, troubleshoot, and improve data pipeline performance.
- Implement best practices in data governance, security, and compliance.
- Work with cloud-based big data platforms (AWS, Azure, GCP) and distributed systems.
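As a small illustration of the cleansing and validation responsibility above, here is a sketch in plain Python (field names such as "user_id" and "amount" are invented for the example; in practice this logic would typically run as a PySpark transformation over a DataFrame):

```python
def cleanse(records):
    """Trim string fields, drop rows missing required values, cast amounts.

    Illustrative only: the required fields and types are assumptions
    made for this example, not part of the job description.
    """
    cleaned = []
    for row in records:
        user_id = (row.get("user_id") or "").strip()
        raw_amount = row.get("amount")
        if not user_id or raw_amount is None:
            continue  # validation: skip rows missing required fields
        try:
            amount = float(raw_amount)
        except (TypeError, ValueError):
            continue  # validation: skip rows with non-numeric amounts
        cleaned.append({"user_id": user_id, "amount": amount})
    return cleaned

raw = [
    {"user_id": " u1 ", "amount": "9.50"},
    {"user_id": "", "amount": "3.00"},    # dropped: missing id
    {"user_id": "u2", "amount": "oops"},  # dropped: bad amount
]
print(cleanse(raw))  # → [{'user_id': 'u1', 'amount': 9.5}]
```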
Required Skills & Qualifications:
- Strong Python skills and hands-on experience with PySpark for big data processing.
- Solid understanding of data structures, algorithms, and distributed computing concepts.
- Experience with SQL and relational databases (e.g., MySQL, PostgreSQL).
- Knowledge of data warehousing concepts and tools (e.g., Snowflake, Redshift, BigQuery).
- Familiarity with workflow orchestration tools (Airflow, Luigi, etc.).
- Experience working with cloud services for data engineering.
- Strong problem-solving and debugging skills.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1529502