hirist

Lead Data Engineer - PySpark/Databricks Platform

Expian Technologies Pvt Ltd
8 - 12 Years
Bangalore

Posted on: 24/03/2026

Job Description

Job Location : Bengaluru, MG Road


Work Mode : Hybrid


Mandatory Tech stacks for the role :

- PySpark

- Databricks (Azure preferred)

- SQL (query writing, including complex queries)

- ETL

- Data Modeling (Star or Snowflake schema)

- Unit testing using Pytest/unittest frameworks only
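Since the role calls for Pytest-based unit tests over pipeline logic, a minimal sketch of the pattern (the `clean_record` function and its field names are hypothetical; in practice the same style extends to PySpark DataFrame logic via a shared SparkSession fixture):

```python
# Hypothetical row-level cleaning step, factored out of a pipeline so it
# can be unit-tested without a Spark cluster.
def clean_record(record: dict) -> dict:
    """Normalise a raw event record: trim strings, default missing amounts."""
    return {
        "customer_id": record["customer_id"].strip(),
        "amount": float(record.get("amount") or 0.0),
    }

# Pytest-style test: a plain function with assert statements,
# discovered and run by the pytest runner.
def test_clean_record_defaults_missing_amount():
    cleaned = clean_record({"customer_id": " c42 ", "amount": None})
    assert cleaned == {"customer_id": "c42", "amount": 0.0}
```

Keeping transformation logic in small pure functions like this is what makes the "unit tests for data pipelines" requirement practical, since each step can be verified without standing up the full Databricks job.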


Job Description :


As a Lead Data Engineer, you will lead the design, implementation, and maintenance of data processing pipelines and workflows using Databricks on the Azure platform. Your expertise in PySpark, SQL, Databricks, test-driven development, and Docker will be essential to the success of our data engineering initiatives.


Roles & Responsibilities :


- Collaborate with cross-functional teams to understand data requirements and design scalable and efficient data processing solutions.


- Develop and maintain data pipelines using PySpark and SQL on the Databricks platform.


- Optimise and tune data processing jobs for performance and reliability.


- Implement automated testing and monitoring processes to ensure data quality and reliability.


- Work closely with data scientists, data analysts, and other stakeholders to understand their data needs and provide effective solutions.


- Troubleshoot and resolve data-related issues, including performance bottlenecks and data quality problems.


- Stay up to date with industry trends and best practices in data engineering and Databricks.


Key Requirements :


- 8+ years of experience as a Data Engineer, with a focus on Databricks and cloud-based data platforms, including a minimum of 4 years writing unit and end-to-end tests for data pipelines and ETL processes on Databricks.


- Hands-on experience in PySpark programming for data manipulation, transformation, and analysis.


- Strong experience in SQL and writing complex queries for data retrieval and manipulation.


- Experience with Docker for containerising and deploying data engineering applications is good to have.


- Strong knowledge of the Databricks platform and its components, including Databricks notebooks, clusters, and jobs.


- Experience in designing and implementing data models to support analytical and reporting needs will be an added advantage.
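To illustrate the star-schema modelling mentioned in the requirements, a minimal sketch using Python's built-in sqlite3 (all table and column names are invented for the example): one fact table keyed to one dimension table, queried with the kind of aggregate join a reporting workload would run.

```python
import sqlite3

# In-memory database with one fact table and one dimension table --
# the minimal shape of a star schema (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales  (product_id INTEGER, amount REAL);
    INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
    INSERT INTO fact_sales  VALUES (1, 10.0), (1, 5.0), (2, 7.5);
""")

# Typical reporting query: join the fact table to its dimension and aggregate.
rows = conn.execute("""
    SELECT d.category, SUM(f.amount) AS total
    FROM fact_sales f
    JOIN dim_product d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()

print(rows)  # [('books', 15.0), ('games', 7.5)]
```

The same fact/dimension split carries over directly to Delta tables on Databricks; sqlite3 is used here only so the sketch is self-contained.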
