
Data Engineer - PySpark/SQL

GALAXY I TECHNOLOGIES INC
Multiple Locations
6 - 8 Years

Posted on: 26/09/2025

Job Description

Key Responsibilities :

- Design, build, and optimize data pipelines leveraging Apache Spark, PySpark, and SQL.

- Develop scalable data integration, transformation, and modeling solutions for analytics and reporting.

- Implement and manage CI/CD pipelines, release management, and production deployments.

- Collaborate with data science teams to enable advanced analytics workloads using Python and PySpark.

- Orchestrate workflows using Apache Airflow or equivalent platforms (SSIS, etc.).

- Support data ingestion from multiple structured/unstructured sources and develop custom connectors where needed.

- Monitor, troubleshoot, and optimize production data pipelines for performance and reliability.

- Work with Kubernetes/Docker for containerized data services and deployments.

- Ensure data quality, security, and compliance across all environments.


Required Technical Skills :


- Apache Spark (Strong expertise, mandatory)

- PySpark (Strong expertise, mandatory)

- Python (Data Science focused)

- SQL / MySQL (Advanced, with query optimization)

- Linux, Git, CI/CD, Release Management, Production Deployment

- Kubernetes / Docker (Good hands-on experience)

- Java (Good to have working knowledge)


Experience Required :


- 6 to 8 years of hands-on experience in core skills: Spark, PySpark, Python, SQL, and data engineering.

- 5+ years overall experience in data engineering, ELT development, and data modeling.

- Proven experience with workflow orchestration, data ingestion, and performance optimization.

