Posted on: 17/10/2025
Key Responsibilities:
- Design and Development: Lead the end-to-end design, development, and implementation of complex, metadata-driven ETL/ELT data pipelines using Azure Data Factory (ADF), ensuring data is accurately extracted, transformed, and loaded from various disparate source systems (on-premise, cloud, APIs, files) into target data stores.
- Big Data Processing: Utilize Azure Databricks (using PySpark/Spark SQL/Scala) for large-scale data transformation, cleansing, aggregation, and complex data processing tasks, including working with structured, semi-structured, and unstructured data formats (Parquet, Delta Lake, JSON, CSV).
- Data Storage and Modeling: Design and manage data storage solutions on Azure, including Azure Data Lake Storage (ADLS) Gen2, Azure Synapse Analytics (Dedicated/Serverless SQL Pools), and Azure SQL Database, ensuring optimal performance and cost-effectiveness.
- Performance Tuning: Proactively monitor, optimize, and troubleshoot existing data pipelines and data solutions to eliminate performance bottlenecks, improve efficiency, and control cost.
- Data Quality and Governance: Implement robust data validation, data cleansing, error handling, and data governance frameworks, leveraging tools like Azure Purview, to ensure data quality, integrity, security, and compliance.
- Automation and DevOps: Implement CI/CD pipelines using Azure DevOps (Boards, Repos, Pipelines) for automated deployment, testing, and continuous integration of data solutions.
- Collaboration and Mentorship: Collaborate closely with Data Architects, Data Analysts, Data Scientists, and business stakeholders to translate complex business requirements into scalable technical solutions. May mentor junior engineers on best practices and Azure data technologies.
- Documentation: Create and maintain comprehensive documentation for all ETL/ELT processes, data flows, data models, and solution architectures.
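The metadata-driven pipeline pattern named in the first responsibility can be sketched in plain Python. This is an illustration of the pattern only, not ADF code: the control table, source names, and cleansing rule below are all hypothetical, standing in for what would be an ADF Lookup activity driving a parameterized copy/transform flow.

```python
# Sketch of a metadata-driven ETL loop: a control table lists each
# source and its target, and one generic routine processes every entry.
# All source/target/column names here are hypothetical illustrations.

CONTROL_TABLE = [
    {"source": "crm_customers", "target": "dim_customer"},
    {"source": "erp_orders", "target": "fact_orders"},
]

def extract(source):
    # Stand-in for reading from an on-premise DB, API, or file drop.
    sample = {
        "crm_customers": [{"id": 1, "name": " Ada "}, {"id": 2, "name": "Bob"}],
        "erp_orders": [{"order_id": 10, "amount": "5.00"}],
    }
    return sample[source]

def transform(rows):
    # Trim whitespace in string fields -- a typical cleansing step.
    return [{k: v.strip() if isinstance(v, str) else v for k, v in r.items()}
            for r in rows]

def load(target, rows, warehouse):
    # Stand-in for writing to ADLS/Synapse; here just an in-memory dict.
    warehouse.setdefault(target, []).extend(rows)

def run_pipeline(control_table, warehouse):
    # Adding a new source becomes a control-table entry, not new code.
    for entry in control_table:
        load(entry["target"], transform(extract(entry["source"])), warehouse)
    return warehouse

warehouse = run_pipeline(CONTROL_TABLE, {})
```

The design point is that new sources are onboarded by adding a row of metadata rather than building another bespoke pipeline, which is what makes the ADF + control-table approach scale across many source systems.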
Required Technical Skills & Experience
5-9 years of overall IT experience, with at least 4 years focused on data engineering/ETL development on the Microsoft Azure cloud platform.
Expert-level proficiency with Azure Data Factory (ADF) for data orchestration, pipeline development, and managing self-hosted integration runtimes.
Strong hands-on experience with Azure Databricks and PySpark (or Scala) for building and optimizing data transformation jobs on large datasets.
In-depth knowledge of cloud-native data storage: Azure Data Lake Storage (ADLS) Gen2 and Azure Synapse Analytics (Dedicated SQL Pools/Spark Pools).
Expertise in SQL (T-SQL, Spark SQL) for advanced querying, stored procedures, function development, and query performance tuning.
Proficiency in one or more programming/scripting languages: Python is mandatory, and familiarity with PowerShell or Scala is a plus.
Solid understanding of Data Warehousing concepts, Data Modeling (Star/Snowflake schema, dimensional modeling), and relational database principles.
Experience implementing data security measures (RBAC, encryption) and managing data lineage/governance within Azure.
Familiarity with modern data architectural patterns (e.g., Data Mesh, Data Lakehouse, Delta Lake).
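As a concrete illustration of the star-schema dimensional modeling called out above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are hypothetical; in practice the warehouse would live in Synapse or Azure SQL, but the join pattern is the same.

```python
import sqlite3

# Minimal star schema: one fact table surrounded by dimension tables,
# joined on surrogate keys. All names and values are illustrative only.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_product (product_sk INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_date    (date_sk INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE fact_sales  (product_sk INTEGER, date_sk INTEGER, amount REAL);
    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_date    VALUES (1, 2024), (2, 2025);
    INSERT INTO fact_sales  VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 2, 7.5);
""")

# Typical dimensional query: aggregate the fact table, sliced by
# attributes that live in the dimensions.
rows = con.execute("""
    SELECT p.name, d.year, SUM(f.amount) AS total
    FROM fact_sales f
    JOIN dim_product p ON p.product_sk = f.product_sk
    JOIN dim_date d    ON d.date_sk    = f.date_sk
    GROUP BY p.name, d.year
    ORDER BY p.name, d.year
""").fetchall()
```

Keeping descriptive attributes in the dimensions and only keys and measures in the fact table is what lets this query stay simple as new dimensions (store, channel, customer) are added.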
Desired Qualifications (Nice-to-Have)
Experience with real-time/streaming data processing using Azure Event Hubs, Azure Stream Analytics, or Kafka.
Knowledge of Azure Functions for custom data processing logic.
Experience with other ETL tools such as Informatica, Talend, or SSIS.
Certification: Microsoft Certified: Azure Data Engineer Associate (DP-203).
Familiarity with Power BI or other data visualization tools.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1561968