Posted on: 13/07/2025
Project Overview :
We are seeking skilled Data Engineers with strong hands-on experience in Databricks for a high-impact data platform project.
The project involves building scalable data pipelines, integrating with cloud platforms, and managing data governance using Unity Catalog.
Location : Noida/Gurgaon/Hyderabad/Bangalore/Pune
Key Requirements :
Primary Skills :
- Expertise in Databricks, deployed on either AWS or Azure (cloud-specific skills are not mandatory unless otherwise specified).
- Proficiency in PySpark, Spark SQL, Python, and Delta Lake (a brief illustrative sketch follows this list).
- Hands-on experience in building data pipelines and cloud integration.
- Familiarity with Databricks Unity Catalog and Federated Catalog.
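For illustration only, the sketch below shows the kind of pipeline code this role involves: reading raw data with PySpark, applying a small transformation, and writing the result to Delta Lake. The paths, column names, and aggregation logic are hypothetical placeholders, not project specifics.

    # Illustrative sketch only: read raw data, transform with PySpark, write to Delta Lake.
    # The paths, schema, and business logic below are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("example-pipeline").getOrCreate()

    # Read raw source data (placeholder location).
    orders = spark.read.json("/mnt/raw/orders/")

    # Filter completed orders and aggregate revenue per day.
    daily_revenue = (
        orders
        .filter(F.col("status") == "COMPLETED")
        .withColumn("order_date", F.to_date("order_ts"))
        .groupBy("order_date")
        .agg(F.sum("amount").alias("revenue"))
    )

    # Persist the result as a Delta table (overwrite kept simple for the sketch).
    daily_revenue.write.format("delta").mode("overwrite").save("/mnt/curated/daily_revenue")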
Cloud Environment :
- Engineers working on Databricks on AWS are not required to have Azure-specific skills, and vice versa.
- Multi-cloud exposure (AWS & Azure) is a plus due to the diverse project environment.
Infrastructure as Code (IaC) :
- Experience with Terraform for infrastructure provisioning and automation.
Testing and Quality :
- Experience in writing unit tests and integration tests using pytest (illustrated in the sketch after this list).
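As a hedged illustration of the expected testing style, the sketch below unit-tests a small PySpark transformation with pytest using a local SparkSession fixture. The function, column names, and expected value are assumptions made up for the example.

    # Illustrative sketch only: pytest unit test for a small PySpark transformation.
    # Function, column names, and expected values are assumptions for the example.
    import pytest
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    def add_order_date(df):
        # Derive an order_date column from the order_ts timestamp string.
        return df.withColumn("order_date", F.to_date("order_ts"))

    @pytest.fixture(scope="session")
    def spark():
        # Local SparkSession so the test can run outside Databricks.
        return SparkSession.builder.master("local[1]").appName("unit-tests").getOrCreate()

    def test_add_order_date(spark):
        df = spark.createDataFrame([("2025-07-13 10:00:00",)], ["order_ts"])
        result = add_order_date(df).select("order_date").first()[0]
        assert str(result) == "2025-07-13"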
Key Responsibilities :
- Data Pipeline Development : Design, develop, and optimize scalable and robust data pipelines using Databricks, PySpark, Spark SQL, and Python.
- Cloud Integration : Integrate data pipelines with either AWS or Azure cloud platforms, depending on the project's specific cloud environment.
- Delta Lake Management : Implement and manage data solutions leveraging Delta Lake for reliable and performant data storage and processing (a minimal upsert sketch follows this list).
- Data Governance : Work with Databricks Unity Catalog and Federated Catalog to ensure data governance, security, and discoverability across the data landscape.
- Infrastructure Automation : Utilize Terraform for provisioning and managing cloud infrastructure related to data platform components.
- Testing and Quality Assurance : Develop and execute comprehensive unit tests and integration tests using pytest to ensure data quality and pipeline reliability.
- Performance Optimization : Identify and address performance bottlenecks in data pipelines and queries to ensure optimal efficiency.
- Collaboration : Work closely with data architects, data scientists, and other engineering teams to understand requirements and deliver high-quality data solutions.
- Documentation : Create and maintain technical documentation for data pipelines, processes, and systems.
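To make the Delta Lake and Unity Catalog responsibilities concrete, here is a minimal, assumption-laden sketch of an incremental upsert (MERGE) into a Delta table addressed by a Unity Catalog three-level name (catalog.schema.table). The staging path, table name, and join key are hypothetical.

    # Illustrative sketch only: incremental upsert into a Unity Catalog-governed
    # Delta table via a three-level name. Catalog, schema, table, and key names
    # are hypothetical.
    from pyspark.sql import SparkSession
    from delta.tables import DeltaTable

    spark = SparkSession.builder.getOrCreate()

    # New or changed rows staged by an upstream pipeline step (placeholder path).
    updates = spark.read.format("delta").load("/mnt/staging/customer_updates")

    # Target table addressed as catalog.schema.table under Unity Catalog.
    target = DeltaTable.forName(spark, "main.sales.customers")

    # Upsert: update matching customers, insert new ones.
    (
        target.alias("t")
        .merge(updates.alias("s"), "t.customer_id = s.customer_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )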
Posted in : Data Engineering
Functional Area : Data Engineering
Job Code : 1512384