Posted on: 18/07/2025
Job Description:
What You'll Do:
- ETL Development: Develop and maintain complex ETL/ELT pipelines using Python, AWS Glue Jobs, and PySpark, with PyCharm as your primary development environment (see the sketch after this list).
- Big Data Management: Work extensively with Big Data technologies to process, store, and manage large datasets.
- Virtual Machine Management: Manage and optimize data processing environments on virtual machines, including Google VM instances, AWS EC2 instances, and Google Colab.
- Data Quality & Performance: Ensure the high quality, accuracy, and performance of data throughout its lifecycle within the data platform.
- Collaboration: Work closely with data scientists, analysts, and other engineering teams to understand data requirements and deliver robust data solutions.
- Monitoring & Optimization: Monitor data pipelines and systems for performance, reliability, and cost-effectiveness, implementing optimizations as needed.
- Visualization Support (good to have): Collaborate with data analysts on data visualization needs, potentially using tools like Looker Studio.
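
For illustration, a minimal sketch of the kind of Glue/PySpark ETL job described in the first bullet. The bucket paths, column names, and table layout are hypothetical placeholders, not part of this posting:

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

# Standard Glue job bootstrap: resolve the job name passed in by the Glue runner.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Extract: read raw JSON events from S3 (hypothetical bucket and prefix).
raw = spark.read.json("s3://example-raw-bucket/events/")

# Transform: drop rows without an id and derive a partition column.
clean = (
    raw.filter(F.col("event_id").isNotNull())
       .withColumn("event_date", F.to_date(F.col("event_ts")))
)

# Load: write partitioned Parquet to the curated zone (hypothetical bucket).
(
    clean.write.mode("overwrite")
         .partitionBy("event_date")
         .parquet("s3://example-curated-bucket/events/")
)

job.commit()
```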
Key Requirements:
- Strong expertise in cloud platforms: AWS (including S3) and GCP (Google Cloud Platform) are must-haves.
- Proficiency with cloud databases/data warehouses: AWS Redshift/S3, GCP BigQuery, and Big Data technologies are must-haves (see the BigQuery sketch after this list).
- Mandatory hands-on experience with ETL tools and languages: Python, AWS Glue Jobs, PySpark, and the PyCharm IDE.
- Experience working with virtual machines and notebook environments such as Google Colab, Google VM instances, and AWS EC2 instances is a must-have.
- Strong understanding of data modeling, data warehousing, and ETL/ELT principles.
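
For context, a minimal sketch of querying GCP BigQuery from Python with the google-cloud-bigquery client; the project, dataset, and table names are hypothetical:

```python
from google.cloud import bigquery

# The client picks up Application Default Credentials (e.g., on a GCP VM
# or after `gcloud auth application-default login`).
client = bigquery.Client()

# Hypothetical project/dataset/table; replace with real identifiers.
query = """
    SELECT event_date, COUNT(*) AS events
    FROM `example-project.example_dataset.events`
    GROUP BY event_date
    ORDER BY event_date
"""

# Run the query and iterate over the result rows.
for row in client.query(query).result():
    print(row.event_date, row.events)
```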
Good to Have:
- Familiarity with other data orchestration tools (e.g., Airflow); a minimal DAG sketch follows below.
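
For context, a minimal Airflow DAG sketch of the orchestration pattern mentioned above, assuming Airflow 2.4+ (for the `schedule` parameter); the DAG id, schedule, and task callables are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Hypothetical placeholder for the real extract logic.
    print("extracting...")


def load():
    # Hypothetical placeholder for the real load logic.
    print("loading...")


with DAG(
    dag_id="example_etl",            # hypothetical DAG id
    start_date=datetime(2025, 1, 1),
    schedule="@daily",               # `schedule` assumes Airflow 2.4+
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task  # load runs only after extract succeeds
```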
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1514823