Posted on: 10/07/2025
Role Summary:
We are looking for an experienced Data Engineer to design, build, and optimize large-scale data processing systems.
The ideal candidate will have strong expertise in Python programming, Apache Spark for big data processing, and Apache Airflow for workflow orchestration.
This role requires a deep understanding of data architecture principles and best practices to enable scalable, reliable, and efficient data pipelines that support advanced analytics and business intelligence.
Key Responsibilities:
- Design and implement scalable data architectures and pipelines leveraging Apache Spark and Python to process large datasets efficiently.
- Develop and maintain workflow orchestration solutions using Apache Airflow to automate complex data processing tasks (see the illustrative sketch after this list).
- Collaborate with data engineering, data science, and analytics teams to understand data requirements and translate them into robust architectural solutions.
- Define data standards, governance, and best practices to ensure data quality, security, and compliance.
- Optimize data storage and retrieval strategies across data lakes, warehouses, and streaming platforms.
- Evaluate new tools, frameworks, and technologies to improve data platform capabilities.
- Monitor and troubleshoot data pipeline performance issues, ensuring high availability and reliability.
- Document architectural designs, data flows, and operational processes.
- Mentor and guide junior data engineers and analysts on architectural best practices and technical skills.
- Work closely with cloud platform teams to integrate and deploy data solutions (e.g., AWS, Azure, or GCP).
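By way of illustration, a minimal Airflow DAG of the kind this role involves might look like the sketch below: it submits a nightly PySpark batch job. The DAG id, owner, schedule, and script path are placeholder assumptions, not details from this posting.

# A minimal, illustrative Airflow DAG; the DAG id, schedule, and
# script path are placeholder assumptions, not actual project values.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-engineering",        # hypothetical owning team
    "retries": 2,                       # retry transient failures
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="daily_sales_pipeline",      # hypothetical pipeline name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    # Submit a PySpark batch job; /opt/jobs/transform_sales.py is a
    # placeholder path for the Spark application being orchestrated.
    run_spark_job = BashOperator(
        task_id="run_spark_transform",
        bash_command="spark-submit /opt/jobs/transform_sales.py",
    )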
Required Skills & Experience:
- Strong programming skills in Python for data processing, scripting, and automation.
- Hands-on expertise in Apache Spark for batch and stream processing of large-scale datasets (see the sketch after this list).
- Experience designing and managing workflows with Apache Airflow, including DAG development and monitoring.
- Solid understanding of data architecture principles, ETL/ELT design patterns, and distributed computing.
- Experience with relational and NoSQL databases, data lakes, and data warehouses.
- Familiarity with cloud data services and infrastructure (AWS, Azure, or GCP).
- Proficient in SQL and data modeling techniques.
- Knowledge of data governance, security practices, and compliance standards.
- Strong analytical, problem-solving, and communication skills.
- Ability to work collaboratively in cross-functional teams and mentor junior members.
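As a small illustration of the Spark batch processing described above, the sketch below aggregates daily sales events with PySpark. The storage paths and column names are hypothetical assumptions, not details from this posting.

# A minimal PySpark batch sketch; the input/output paths and column
# names are illustrative assumptions, not actual project values.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("sales_batch_example").getOrCreate()

# Read raw events from a (hypothetical) data-lake path.
events = spark.read.parquet("s3a://example-lake/raw/sales/")

# Aggregate revenue per day; 'event_date' and 'amount' are assumed columns.
daily_revenue = (
    events
    .groupBy("event_date")
    .agg(F.sum("amount").alias("total_revenue"))
)

# Write results partitioned by date for efficient downstream reads.
daily_revenue.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3a://example-lake/curated/daily_revenue/"
)

spark.stop()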
Preferred Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Experience with containerization and orchestration tools (Docker, Kubernetes).
- Familiarity with other data processing frameworks (Kafka, Flink).
- Exposure to machine learning pipelines and MLOps practices.
- Certifications related to Big Data, Cloud, or Data Architecture.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1511384