hirist

Job Description

Note : Women Candidates Preferred


Description :


Job Summary :


We are seeking a highly skilled and experienced Data Engineer to lead the design, development, and maintenance of scalable Databricks pipelines. The ideal candidate will have deep expertise in the Azure data ecosystem, specifically leveraging Python, Databricks, and Azure Data Factory (ADF) to transform raw data into actionable insights. You will play a key role in optimizing data architecture and mentoring junior developers.


Key Responsibilities :


- Databricks Pipeline Development : Design, build, and orchestrate robust ETL/ELT pipelines using Azure Data Factory (ADF) and Azure Databricks to ingest data from various on-premises and cloud sources.


- Data Transformation : Utilize Python (PySpark) and SQL to perform complex data transformations, cleaning, and validation within Databricks notebooks.


- Architecture & Optimization : Collaborate with architects to define data models and optimize pipeline performance (latency, throughput, and cost) for large-scale datasets.


- Data Lake Management : Manage and organize data within Azure Data Lake Storage (ADLS Gen2), implementing Delta Lake best practices for ACID transactions and time travel.


- Quality & Governance : Implement automated testing, data quality checks, and monitoring to ensure data integrity and availability.


- CI/CD & DevOps : Manage code versioning and deployment pipelines using Azure DevOps, Git, and CI/CD methodologies.


- Mentorship : Guide junior engineers, conduct code reviews, and establish best practices for coding standards and documentation.
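The transformation, cleaning, and validation work described above can be sketched at the record level. The field names (`order_id`, `amount`, `event_time`) are hypothetical; in a Databricks pipeline the same logic would typically run as a PySpark DataFrame transformation rather than row by row.

```python
def clean_record(record):
    """Validate and normalize one raw row; return None to drop it.

    Field names here are illustrative placeholders, not from any
    particular schema.
    """
    required = ("order_id", "amount", "event_time")
    if any(record.get(key) in (None, "") for key in required):
        return None  # drop rows missing a required field
    try:
        amount = float(record["amount"])  # coerce string amounts
    except (TypeError, ValueError):
        return None  # drop rows with unparseable amounts
    return {
        "order_id": str(record["order_id"]).strip(),
        "amount": round(amount, 2),
        "event_time": record["event_time"],
    }


raw = [
    {"order_id": " A1 ", "amount": "19.99", "event_time": "2024-01-01T00:00:00"},
    {"order_id": "A2", "amount": None, "event_time": "2024-01-01T00:01:00"},
]
cleaned = [row for row in (clean_record(r) for r in raw) if row is not None]
```

In PySpark the equivalent checks would usually be expressed as column expressions (`filter`, `withColumn`, `cast`) so Spark can run them in parallel across the cluster.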


Required Qualifications :


- Experience : 5 to 8 years of proven experience in Data Engineering and development.


Core Technologies :


- Python : Advanced proficiency in Python coding and scripting.


- Databricks : Strong hands-on experience with Azure Databricks, including cluster management, job scheduling, and performance tuning.


- Spark : Deep understanding of Apache Spark architecture and PySpark.


- ADF : Extensive experience creating pipelines, linked services, and datasets in Azure Data Factory.


- Cloud Storage : Proficiency with Azure Data Lake Storage (ADLS Gen2) and Blob Storage.


- Database Skills : Strong Cassandra and PostgreSQL skills for querying and analyzing data in data lakes.


- Problem Solving : Ability to troubleshoot complex data issues and optimize slow-running queries or jobs.


Preferred Skills ("Surrounding Tools") :


- Experience with Unity Catalog for data governance.


- Understanding of Event Hubs or Kafka for real-time data streaming.


- Understanding of how to call APIs from Python.
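The API-calling skill above can be sketched with the standard library alone. The function name `fetch_json` and the bearer-token header are illustrative assumptions, not a reference to any specific service.

```python
import json
import urllib.request


def fetch_json(url, token=None, timeout=10):
    """GET a JSON endpoint and return the decoded payload.

    A bearer token is one common auth scheme; real APIs vary.
    """
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

In an ingestion pipeline, a call like this would typically land the raw response in ADLS before transformation, so the ingest step stays replayable.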


Soft Skills :


- Strong communication skills to articulate technical concepts to non-technical stakeholders.


- Agile mindset with experience working in Scrum/Kanban teams.


- Proactive approach to learning new technologies and tools.
