Posted on: 26/09/2025
Key Responsibilities:
- Design, build, and optimize data pipelines leveraging Apache Spark, PySpark, and SQL.
- Develop scalable data integration, transformation, and modeling solutions for analytics and reporting.
- Implement and manage CI/CD pipelines, release management, and production deployments.
- Collaborate with data science teams to enable advanced analytics workloads using Python and PySpark.
- Orchestrate workflows using Apache Airflow or equivalent platforms (SSIS, etc.).
- Support data ingestion from multiple structured/unstructured sources and develop custom connectors where needed.
- Monitor, troubleshoot, and optimize production data pipelines for performance and reliability.
- Work with Kubernetes/Docker for containerized data services and deployments.
- Ensure data quality, security, and compliance across all environments.
Required Technical Skills:
- PySpark (strong expertise, mandatory)
- Python (data science focus)
- SQL / MySQL (Advanced, with query optimization)
- Linux, Git, CI/CD, Release Management, Production Deployment
- Kubernetes / Docker (Good hands-on experience)
- Java (working knowledge, good to have)
Experience Required:
- 5+ years overall experience in data engineering, ELT development, and data modeling.
- Proven experience with workflow orchestration, data ingestion, and performance optimization.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1552737