
Job Description

Job Purpose :

As a Data Engineer - ETL/Spark, you will be responsible for designing, developing, and maintaining data integration workflows and pipelines.

You will work with cross-functional teams to enable seamless data movement, transformation, and storage, ensuring data is accurate, reliable, and optimized for analytics and business intelligence use cases.


Key Responsibilities :

ETL/ELT Development & Maintenance :

- Design, develop, and maintain efficient ETL workflows using modern tools and frameworks (see the sketch after this list).
- Optimize and automate data pipelines to ensure high performance and scalability.
- Monitor and troubleshoot ETL processes to ensure reliability and data quality.
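For illustration, a minimal PySpark sketch of the kind of extract-transform-load workflow described above; the storage paths, column names, and table layout are hypothetical placeholders, not details from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV files (path and layout are placeholders).
raw = spark.read.option("header", True).csv("s3://raw-bucket/orders/")

# Transform: deduplicate, cleanse, and cast to analysis-ready types.
orders = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("amount").isNotNull())
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write partitioned Parquet, ready for analytics and BI queries.
orders.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://curated-bucket/orders/"
)
```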

Big Data Engineering with Spark :

- Develop and optimize Spark-based applications for large-scale data processing.
- Implement distributed data processing workflows leveraging Hadoop, Spark, or other big data ecosystems.
- Ensure Spark jobs are performance-tuned and cost-optimized (see the tuning sketch after this list).
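Typical tuning levers in this kind of work include broadcast joins, selective caching, and key-based repartitioning. A hedged sketch follows, assuming a large fact table joined to a small dimension table; all table names and paths are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark_tuning_demo").getOrCreate()

facts = spark.read.parquet("s3://curated-bucket/orders/")    # large fact table
dims = spark.read.parquet("s3://curated-bucket/customers/")  # small dimension table

# Broadcast the small table so the join avoids shuffling the large one.
enriched = facts.join(F.broadcast(dims), "customer_id")

# Cache only when the intermediate result feeds several downstream actions.
enriched.cache()

# Repartition on the aggregation key so work spreads evenly across executors.
daily = (
    enriched.repartition("order_date")
            .groupBy("order_date")
            .agg(F.sum("amount").alias("revenue"))
)

daily.write.mode("overwrite").parquet("s3://curated-bucket/daily_revenue/")
```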

Data Management & Integration :

- Work with structured and unstructured data from multiple sources, ensuring proper cleansing, transformation, and loading.
- Integrate data from APIs, databases, cloud services, and third-party platforms (see the sketch after this list).
- Ensure compliance with data governance, security, and privacy standards.
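A hedged sketch of one such integration, combining a REST API feed with a relational table read over JDBC; the endpoint, connection details, and response shape are placeholders, and real credentials should come from a secrets manager rather than the code.

```python
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("api_db_integration").getOrCreate()

# Pull records from a third-party REST API (endpoint and response
# shape are placeholders).
resp = requests.get("https://api.example.com/v1/customers", timeout=30)
resp.raise_for_status()
api_df = spark.createDataFrame(resp.json()["results"])

# Read reference data from a relational database over JDBC
# (URL, table, and credentials are placeholders).
db_df = (
    spark.read.format("jdbc")
         .option("url", "jdbc:postgresql://db-host:5432/crm")
         .option("dbtable", "public.accounts")
         .option("user", "etl_user")
         .option("password", "change-me")
         .load()
)

# Join the two sources and persist the integrated view.
combined = api_df.join(db_df, "customer_id", "left")
combined.write.mode("overwrite").parquet("s3://curated-bucket/customers_enriched/")
```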

Collaboration & Business Enablement :

- Partner with data analysts, data scientists, and business stakeholders to understand requirements and deliver data solutions.
- Support data-driven initiatives by providing reliable and timely datasets.
- Document processes, data flows, and technical specifications for reference and knowledge sharing.


Requirements :

- Bachelor's/Master's degree in Computer Science, Information Technology, or a related field.
- 4-8 years of hands-on experience in ETL development and data engineering.
- Strong expertise in Apache Spark (PySpark/Scala/Java) for distributed data processing.
- Proficiency with ETL tools (e.g., Informatica, Talend, Databricks, AWS Glue, or similar).
- Strong knowledge of SQL and experience with relational databases (e.g., Oracle, SQL Server, PostgreSQL, MySQL).
- Experience with cloud platforms (AWS, Azure, or GCP) and cloud-native data services.
- Familiarity with data warehousing concepts and modern architectures (Snowflake, Redshift, BigQuery).
- Understanding of data governance, data quality, and security best practices.
- Excellent problem-solving, communication, and collaboration skills.


Good to Have :

- Experience with streaming platforms (Kafka, Kinesis, Spark Streaming); a minimal sketch follows this list.
- Knowledge of containerization and orchestration (Docker, Kubernetes, Airflow).
- Exposure to machine learning pipelines and data science collaboration.
- Certifications in AWS/Azure/GCP data services.
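For the streaming item above, a minimal Spark Structured Streaming sketch that reads from Kafka and lands raw events with checkpointing; the broker, topic, and paths are hypothetical, and the job assumes the spark-sql-kafka connector package is on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_stream").getOrCreate()

# Subscribe to a Kafka topic (broker address and topic are placeholders).
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "orders")
         .load()
)

# Kafka delivers the payload as bytes; cast it to a string for parsing.
payloads = events.select(F.col("value").cast("string").alias("json"))

# Land the raw events durably; the checkpoint enables restart without data loss.
query = (
    payloads.writeStream.format("parquet")
            .option("path", "s3://raw-bucket/orders_stream/")
            .option("checkpointLocation", "s3://raw-bucket/_checkpoints/orders/")
            .start()
)
query.awaitTermination()
```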

