Posted on: 14/10/2025
Description:
Responsibilities:
- Set up workflows and orchestration processes to streamline data pipelines and ensure efficient data movement within the Azure ecosystem.
- Create and configure compute resources within Databricks, including All-Purpose clusters, SQL warehouses, and Job clusters, to support data processing and analysis.
- Set up and manage Azure Data Lake Storage (ADLS) Gen2 storage accounts and establish seamless integration with the Databricks Workspace for data ingestion and processing.
- Create and manage Service Principals and Key Vaults to securely authenticate and authorize access to Azure resources.
- Utilize ETL (Extract, Transform, Load) techniques to design and implement data warehousing solutions and ensure compliance with data governance policies.
- Develop highly automated ETL scripts for data processing.
- Scale infrastructure resources based on workload requirements, optimizing performance and cost-efficiency.
- Profile new data sources in different formats, including CSV, JSON, etc.
- Apply problem-solving skills to address complex business and technical challenges, such as data quality issues, performance bottlenecks, and system failures.
- Demonstrate excellent soft skills and the ability to effectively communicate and collaborate with clients, stakeholders, and cross-functional teams.
- Implement Continuous Integration/Continuous Deployment (CI/CD) practices to automate the deployment and testing of data pipelines and infrastructure changes.
- Deliver tangible value rapidly, collaborating with diverse teams of varying backgrounds and disciplines.
- Codify best practices for future reuse in the form of accessible, reusable patterns, templates, and code bases.
- Manage timely, appropriate communication and relationships with clients, partners, and other stakeholders.
- Create and manage periodic reporting of project execution status and other trackers in standard accepted formats.
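The profiling responsibility above (inspecting new CSV and JSON sources before building a pipeline) can be sketched in plain Python. This is a minimal illustration, not part of the posting: the function names, the sample data, and the choice of profile metrics (columns, row count, null counts) are all assumptions; a production pipeline would do this at scale with PySpark.

```python
import csv
import io
import json

def profile_csv(text):
    """Profile a CSV source: column names, row count, per-column empty-value counts."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    nulls = {c: sum(1 for r in rows if not r[c]) for c in columns}
    return {"format": "csv", "columns": columns, "row_count": len(rows), "null_counts": nulls}

def profile_json(text):
    """Profile a JSON array source: union of keys across records, record count."""
    records = json.loads(text)
    keys = sorted({k for rec in records for k in rec})
    return {"format": "json", "columns": keys, "row_count": len(records)}

# Hypothetical sample inputs for illustration only.
sample_csv = "id,name\n1,alice\n2,\n"
sample_json = '[{"id": 1, "name": "alice"}, {"id": 2, "city": "pune"}]'

print(profile_csv(sample_csv))
print(profile_json(sample_json))
```

A report like this (schema, volume, null rate) is typically the first artifact produced when onboarding a new source, before any warehouse modelling is done.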
Requirements:
- Experience in the Data Engineering domain: 2+ years.
- Skills: SQL, Python, PySpark, Spark, distributed systems.
- Azure Databricks, Azure Data Factory, ADLS Gen2, Blob Storage.
- Key Vaults, Azure DevOps.
- ETL, Building Data Pipelines, Data Warehousing, Data Modelling, and Governance.
- Agile practices, SDLC, and multi-year experience with the Azure Databricks ecosystem and PySpark.
- Ability to write clean, concise, and organized PySpark code.
- Ability to break down a project into executable steps, prepare a data flow diagram (DFD), and execute it.
- Propose innovative DE solutions to achieve business objectives. A quick thinker with strong technical skills who can communicate complex logic clearly.
- Good knowledge of ADF and Docker/containerization.
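The requirements above ask for clean, organized code and the ability to break a project into executable steps. One common way to structure that is the extract/transform/load decomposition sketched below in plain Python. Everything here is illustrative: the function names, the "id:value" input format, and the in-memory sink are assumptions; in the Databricks stack described in this posting, the transform step would be PySpark DataFrame operations and the load step would write to a table.

```python
def extract(raw_lines):
    """Step 1: parse raw "id:value" lines into records."""
    return [dict(zip(("id", "value"), line.split(":"))) for line in raw_lines]

def transform(records):
    """Step 2: cast values to int, dropping malformed records."""
    out = []
    for rec in records:
        try:
            out.append({"id": rec["id"], "value": int(rec["value"])})
        except (KeyError, ValueError):
            continue  # skip records that fail the cast
    return out

def load(records, sink):
    """Step 3: append records to an in-memory sink (stand-in for a warehouse table)."""
    sink.extend(records)
    return len(records)

raw = ["a:1", "b:two", "c:3"]  # hypothetical input; "b:two" is malformed
sink = []
loaded = load(transform(extract(raw)), sink)
print(loaded, sink)
```

Keeping each step a small, single-purpose function is what makes the pipeline easy to diagram (each function becomes a node in the DFD) and easy to test in isolation.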
Good to Have:
- Event Hubs, Logic Apps.
- Power BI.
- Competitive coding background and strong command of PySpark syntax.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1560044