
Job Description

Data Engineer/Lead - Hungary

Experience : 2-12 Years

Location : Hungary (Remote/Hybrid)

About the Role :

We are looking for a Data Engineer/Lead to design, build, and optimize large-scale data pipelines and cloud-native data platforms.

This role is open to candidates with 2 to 12 years of experience, and the scope of responsibilities will naturally evolve with seniority, from building data workflows to leading data architecture and guiding cross-functional data teams.

You will work with modern cloud data engineering tools, distributed data processing frameworks, and enterprise-grade ETL/ELT systems while collaborating closely with ML, BI, and analytics teams.

Responsibilities :


Data Pipeline Development :


- Design, develop, and maintain scalable, reliable, and secure data pipelines.

- Implement ingestion, transformation, and data processing workflows with modern data engineering stacks.

- Automate data movement between sources, data lakes, and warehouses.

ETL/ELT Framework Engineering :


- Build ETL/ELT pipelines using tools such as Azure Data Factory, Databricks, PySpark, Airflow, and dbt (a minimal sketch follows below).

- Implement robust data validation, quality checks, partitioning, scheduling, and orchestration strategies.
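
For a concrete picture, here is a minimal PySpark sketch of such a batch ETL step; the paths, columns, and table names are hypothetical placeholders rather than anything prescribed by the role.

```python
# Minimal PySpark batch ETL sketch: ingest raw CSV, clean it, load Parquet.
# All paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw files landed in the data lake.
raw = spark.read.option("header", True).csv("/landing/orders/*.csv")

# Transform: cast types, drop malformed rows, stamp a date for partitioning.
orders = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("order_id").isNotNull())
       .withColumn("load_date", F.current_date())
)

# Load: write partitioned Parquet into the curated zone.
orders.write.mode("append").partitionBy("load_date").parquet("/curated/orders")
```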

Cloud Data Engineering :


- Work with cloud-based data tools and storage solutions, including :

- Azure : Data Factory, Synapse, Databricks

- Google Cloud : BigQuery

- Snowflake for cloud data warehousing

- Build and optimize data models using Delta Lake / Lakehouse architectures, as in the sketch below.
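
By way of illustration, a small Delta Lake upsert sketch, assuming the delta-spark package and a Delta-enabled SparkSession (as on Databricks); the table paths and join key are hypothetical.

```python
# Delta Lake MERGE (upsert) sketch. Assumes delta-spark is installed and
# the SparkSession is Delta-enabled; all paths and keys are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta_upsert").getOrCreate()

updates = spark.read.parquet("/curated/orders")          # new/changed rows
target = DeltaTable.forPath(spark, "/lakehouse/orders")  # existing table

# MERGE keeps the load idempotent: re-running it does not duplicate rows.
(target.alias("t")
       .merge(updates.alias("s"), "t.order_id = s.order_id")
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute())
```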

Distributed Data Processing :


- Build high-performance pipelines using Spark, PySpark, Databricks, or other distributed computing engines.

- Optimize code for large datasets, cluster utilization, and cost efficiency (two common levers are sketched below).
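
Two common levers, sketched with hypothetical tables: broadcasting a small dimension table to avoid shuffling the large side of a join, and repartitioning by the write key before a wide write.

```python
# Spark performance/cost sketch: broadcast join + explicit partitioning.
# Table paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("optimize_demo").getOrCreate()

facts = spark.read.parquet("/curated/orders")     # large fact table
dims = spark.read.parquet("/curated/customers")   # small lookup table

# Broadcasting the small side skips a full shuffle of the large table.
joined = facts.join(F.broadcast(dims), "customer_id")

# Repartition by the write key so output files align with downstream reads.
(joined.repartition("load_date")
       .write.mode("overwrite")
       .partitionBy("load_date")
       .parquet("/curated/orders_enriched"))
```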

Real-Time Data & Streaming :


- Work with streaming and event-driven technologies such as Kafka.

- Build streaming pipelines for real-time ingestion and transformations, as in the sketch below.
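
Here is a sketch of what that can look like with Spark Structured Streaming (which needs the spark-sql-kafka connector on the classpath); the broker address, topic, and event schema are hypothetical.

```python
# Structured Streaming sketch: consume a Kafka topic, parse JSON events,
# and append them to files with a checkpoint for restartability.
# Broker address, topic, paths, and schema are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("orders_stream").getOrCreate()

schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
])

events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "orders")
         .load()
         .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
         .select("e.*")
)

# The checkpoint lets the stream restart without reprocessing or data loss.
query = (
    events.writeStream.format("parquet")
          .option("path", "/curated/orders_stream")
          .option("checkpointLocation", "/checkpoints/orders_stream")
          .start()
)
query.awaitTermination()
```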

Data Quality, Governance & Performance :


- Implement data quality frameworks, schema validation, lineage tracking, and audit mechanisms (a lightweight example follows this block).

- Optimize SQL queries, Spark jobs, and processing workflows for performance and cost.
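
A lightweight version of such a quality gate, with hypothetical columns and thresholds, might look like this:

```python
# Data quality gate sketch: schema conformance plus null/duplicate checks
# before a dataset is published. Columns and thresholds are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq_checks").getOrCreate()
df = spark.read.parquet("/curated/orders")

# Schema check: required columns must be present.
missing = {"order_id", "amount", "load_date"} - set(df.columns)
if missing:
    raise ValueError(f"Schema check failed, missing columns: {missing}")

total = df.count()
null_ids = df.filter(F.col("order_id").isNull()).count()
dupes = total - df.dropDuplicates(["order_id"]).count()

# Fail fast so bad data never reaches downstream BI/ML consumers.
if null_ids > 0 or dupes > 0.01 * max(total, 1):
    raise ValueError(f"Quality gate failed: {null_ids} null ids, {dupes} dupes")
```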

Collaboration with ML, BI & Product Teams :


- Partner with Machine Learning, Data Science, and BI teams to deliver clean, reliable, analytics-ready datasets.

- Provide data infrastructure support for ML pipelines and BI dashboards.

Automation, DevOps & CI/CD for Data :


- Use Git for version control and CI/CD for data pipeline deployments.

- Automate pipeline deployments, testing, and environment management (a unit-test sketch follows below).
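
As an example of what pipeline testing in CI can look like, here is a small pytest sketch for a hypothetical transformation function; the fixture spins up a local Spark session so the test can run on any CI runner.

```python
# Pipeline unit-test sketch runnable in CI. The transformation under test
# (add_load_date) is a hypothetical example, not a prescribed function.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def add_load_date(df):
    """Example transformation: stamp each row with the load date."""
    return df.withColumn("load_date", F.current_date())


@pytest.fixture(scope="session")
def spark():
    return (SparkSession.builder.master("local[1]")
            .appName("pipeline-tests").getOrCreate())


def test_add_load_date(spark):
    df = spark.createDataFrame([("o1", 10.0)], ["order_id", "amount"])
    out = add_load_date(df)
    assert "load_date" in out.columns
    assert out.count() == 1
```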

Documentation & Best Practices :


- Create and maintain technical documentation for pipelines, data flows, and schemas.

- Follow best practices for coding, security, compliance, and cloud operations.

Required Skills :


Cloud Data Engineering Tools :

- Azure Data Factory

- Databricks

- PySpark

Data Warehousing & Analytics Platforms :

- Snowflake

- BigQuery

- Azure Synapse

Orchestration & Data Processing :

- Kafka

- Spark

- Airflow

- dbt

Programming & Querying :

- Python : Pandas, NumPy

- SQL : advanced querying, query optimization, complex joins, CTEs, and window functions (see the sketch after this list)

Lakehouse & Big Data Frameworks :

- Delta Lake

- Lakehouse architecture principles

DevOps & Cloud Tools :

- Git

- CI/CD pipelines

- Cloud storage systems (AWS S3, Azure Data Lake Storage, GCS, etc.)
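
Referenced from the SQL item above: a sketch of the kind of query this role calls for, a CTE plus a window function, run here through Spark SQL against a hypothetical orders table.

```python
# CTE + window function sketch via Spark SQL: latest order per customer.
# The table and columns are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql_demo").getOrCreate()
spark.read.parquet("/curated/orders").createOrReplaceTempView("orders")

latest_per_customer = spark.sql("""
    WITH ranked AS (
        SELECT customer_id,
               order_id,
               amount,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id
                   ORDER BY load_date DESC
               ) AS rn
        FROM orders
    )
    SELECT customer_id, order_id, amount
    FROM ranked
    WHERE rn = 1
""")
latest_per_customer.show()
```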

Key Responsibility Areas (Unified for 2-12 Years) :


- Build and maintain scalable, secure, cloud-native data pipelines for batch and streaming workloads.

- Develop robust ETL/ELT workflows using modern data tools and frameworks.

- Ensure high data quality, lineage, governance, and performance optimization.

- Work closely with ML, AI, BI, and analytics teams to deliver reliable, well-structured datasets.

- Implement automation for pipeline deployments, testing, and monitoring.

- Maintain documentation for data flows, models, and pipeline logic.

- Optimize data processing workloads for speed, cost efficiency, and scalability.

- Ensure cloud data systems follow best practices in security, compliance, and reliability.

- Troubleshoot data issues, perform root cause analysis, and deliver long-term fixes.

- Contribute positively to Agile development, code reviews, and cross-functional collaboration.
