Posted on: 03/09/2025
Job Purpose:
As a Data Engineer - ETL/Spark, you will be responsible for designing, developing, and maintaining data integration workflows and pipelines.
You will work with cross-functional teams to enable seamless data movement, transformation, and storage, ensuring data is accurate, reliable, and optimized for analytics and business intelligence use cases.
Key Responsibilities:
ETL/ELT Development & Maintenance:
- Design, develop, and maintain efficient ETL workflows using modern tools and frameworks (an illustrative sketch follows this list).
- Optimize and automate data pipelines to ensure high performance and scalability.
- Monitor and troubleshoot ETL processes to ensure reliability and data quality.
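To give a sense of the day-to-day work, below is a minimal PySpark ETL sketch (extract, transform, load). The paths, column names, and schema are hypothetical placeholders, not details from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw CSV files from a (hypothetical) landing zone.
raw = spark.read.option("header", True).csv("s3://landing-zone/orders/")

# Transform: fix types, drop incomplete rows, derive a partition column.
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropna(subset=["order_id", "order_ts"])
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write partitioned Parquet to a curated zone for analytics use.
orders.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://curated-zone/orders/"
)

spark.stop()
```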
Big Data Engineering with Spark:
- Develop and optimize Spark-based applications for large-scale data processing.
- Implement distributed data processing workflows leveraging Hadoop, Spark, or other big data ecosystems.
- Ensure Spark jobs are performance-tuned and cost-optimized (see the sketch after this list).
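A hedged example of the kind of tuning these bullets refer to: broadcasting a small dimension table to avoid a shuffle-heavy join and enabling adaptive query execution. Table names, sizes, and paths are assumptions for illustration only.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder.appName("orders-enrichment")
    # Let Spark coalesce shuffle partitions at runtime (adaptive execution).
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.shuffle.partitions", "200")
    .getOrCreate()
)

facts = spark.read.parquet("s3://curated-zone/orders/")          # large fact table
customers = spark.read.parquet("s3://curated-zone/customers/")   # small dimension

# Broadcast the small dimension so the join avoids a full shuffle.
enriched = facts.join(F.broadcast(customers), on="customer_id", how="left")

enriched.write.mode("overwrite").parquet("s3://curated-zone/orders_enriched/")
```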
Data Management & Integration:
- Work with structured and unstructured data from multiple sources, ensuring proper cleansing, transformation, and loading.
- Integrate data from APIs, databases, cloud services, and third-party platforms (a brief sketch follows this list).
- Ensure compliance with data governance, security, and privacy standards.
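Below is an illustrative sketch of combining a relational source (via JDBC) with a third-party REST API in one pipeline. The URL, credentials, table, endpoint, and join key are all hypothetical.

```python
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("customer-integration").getOrCreate()

# Relational source via JDBC (the driver JAR must be available to Spark).
db_customers = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/crm")
    .option("dbtable", "public.customers")
    .option("user", "etl_user")
    .option("password", "***")
    .load()
)

# Third-party API source, pulled into a DataFrame for joining.
payload = requests.get("https://api.example.com/v1/accounts", timeout=30).json()
api_accounts = spark.createDataFrame(payload["accounts"])

# Conform and merge the two sources on a shared business key.
merged = db_customers.join(api_accounts, on="customer_id", how="left")
merged.write.mode("overwrite").parquet("s3://curated-zone/customers_merged/")
```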
Collaboration & Business Enablement:
- Partner with data analysts, data scientists, and business stakeholders to understand requirements and deliver data solutions.
- Support data-driven initiatives by providing reliable and timely datasets.
- Document processes, data flows, and technical specifications for reference and knowledge sharing.
Requirements:
- Bachelor's/Master's degree in Computer Science, Information Technology, or a related field.
- 4-8 years of hands-on experience in ETL development and data engineering.
- Strong expertise in Apache Spark (PySpark/Scala/Java) for distributed data processing.
- Proficiency with ETL tools (e.g., Informatica, Talend, Databricks, AWS Glue, or similar).
- Strong knowledge of SQL and experience with relational databases (e.g., Oracle, SQL Server, PostgreSQL, MySQL).
- Experience with cloud platforms (AWS, Azure, or GCP) and cloud-native data services.
- Familiarity with data warehousing concepts and modern architectures (Snowflake, Redshift, BigQuery).
- Understanding of data governance, data quality, and security best practices.
- Excellent problem-solving, communication, and collaboration skills.
Good to Have:
- Experience with streaming platforms (Kafka, Kinesis, Spark Streaming).
- Knowledge of containerization and orchestration (Docker, Kubernetes, Airflow).
- Exposure to machine learning pipelines and data science collaboration.
- Certifications in AWS/Azure/GCP data services.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1540221