hirist

Data Engineer - Python/SQL/Spark

SDOD TECHNOLOGIES PRIVATE LIMITED
Multiple Locations
3 - 5 Years

Posted on: 24/09/2025

Job Description

Requirements:


- Strong proficiency in writing complex, optimized SQL queries (especially for Amazon Redshift).

- Experience with Apache Spark (preferably on AWS EMR) for big data processing.

- Proven experience using AWS Glue for ETL pipelines (working with RDS, S3, etc.).

- Strong understanding of data ingestion techniques from diverse sources (files, APIs, relational DBs).

- Solid hands-on experience with Amazon Redshift: data modeling, optimization, and query tuning.

- Familiarity with AWS QuickSight for building dashboards and visual analytics.

- Proficient in Python or PySpark for scripting and data transformation.

- Understanding of data pipeline orchestration, version control, and basic DevOps.
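As a rough illustration of the ingestion requirement above (files, APIs, relational DBs), the following stdlib-only Python sketch normalizes records arriving from a CSV file and an API-style JSON payload into one common schema. All field names and helpers here are hypothetical, not part of any specific pipeline:

```python
import csv
import io
import json

def normalize(record):
    """Map a raw record onto a common typed schema."""
    return {
        "user_id": int(record["user_id"]),
        "amount": float(record["amount"]),
        "source": record["source"],
    }

def ingest_csv(text):
    """Ingest rows from a CSV file-like source."""
    for row in csv.DictReader(io.StringIO(text)):
        yield normalize({**row, "source": "csv"})

def ingest_api(payload):
    """Ingest rows from an API-style JSON payload."""
    for row in json.loads(payload):
        yield normalize({**row, "source": "api"})

csv_text = "user_id,amount\n1,9.50\n2,3.25\n"
api_payload = '[{"user_id": 3, "amount": 7.0}]'
rows = list(ingest_csv(csv_text)) + list(ingest_api(api_payload))
```

In a real pipeline the same normalization step would typically run inside a Glue job or PySpark transformation rather than plain Python, but the shape of the work (read from heterogeneous sources, coerce into one schema, tag provenance) is the same.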


Good-to-have Skills:

- Knowledge of other AWS services (Lambda, Step Functions, Athena, CloudWatch).

- Experience with workflow orchestration tools like Apache Airflow.

- Exposure to real-time streaming tools (Kafka, Kinesis, etc.).

- Familiarity with data security, compliance, and governance best practices.

- Experience with infrastructure as code (e.g., Terraform, CloudFormation).
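At its core, a workflow orchestrator like Apache Airflow runs tasks in dependency order over a DAG. As a minimal sketch of that scheduling idea (this is not Airflow's actual API), Python's stdlib `graphlib` can resolve a small, hypothetical pipeline DAG:

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline DAG: each task maps to the set of tasks it depends on.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "dashboard": {"load"},
}

# static_order() yields each task only after all of its dependencies,
# which is the core guarantee an orchestrator provides when scheduling runs.
order = list(TopologicalSorter(dag).static_order())
```

Airflow adds retries, scheduling intervals, and operators on top, but the dependency-resolution step is the same concept.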


Key Responsibilities:


- Develop, maintain, and optimize complex SQL queries, primarily for Amazon Redshift, ensuring high performance and scalability.

- Build and manage ETL pipelines using AWS Glue, processing data from various sources including RDS, S3, APIs, and relational databases.

- Utilize Apache Spark (preferably on AWS EMR) for large-scale data processing and transformation tasks.

- Design efficient data models and optimize Redshift clusters for storage, query performance, and cost-effectiveness.

- Create and maintain data ingestion workflows from diverse sources such as files, APIs, and databases.

- Develop scripts and data transformations using Python or PySpark.

- Implement and monitor data pipeline orchestration with version control and adhere to DevOps best practices.

- Collaborate with analytics and BI teams, leveraging AWS QuickSight for dashboarding and visualization.

- Ensure data quality, security, and compliance throughout the data lifecycle.
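The data-quality responsibility above usually means a validation pass before data is loaded downstream. The sketch below shows the general shape of such a check in plain Python; the specific rules (required fields, non-negative amounts) are hypothetical examples, not requirements from this posting:

```python
def validate(rows, required=("user_id", "amount")):
    """Split rows into valid and rejected sets using simple quality rules:
    required fields must be present and non-null, and amount must be >= 0."""
    valid, rejected = [], []
    for row in rows:
        if any(row.get(field) is None for field in required):
            rejected.append(row)      # missing or null required field
        elif row["amount"] < 0:
            rejected.append(row)      # fails a domain rule
        else:
            valid.append(row)
    return valid, rejected

rows = [
    {"user_id": 1, "amount": 10.0},
    {"user_id": None, "amount": 5.0},  # missing key field
    {"user_id": 2, "amount": -3.0},    # negative amount
]
valid, rejected = validate(rows)
```

Keeping rejected rows (rather than silently dropping them) lets the pipeline report quality metrics and route bad records to a quarantine location for later review.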

