Posted on: 11/08/2025
Job Description:
- Expertise in designing, implementing, and maintaining data solutions, including Delta Lake, data warehouses, data marts, and data pipelines, on the Databricks platform in support of business and technology objectives.
- Apply design best practices for data modeling (logical and physical) and ETL pipelines (streaming and batch) using AWS cloud-based services.
- Proficiency in ETL implementation using Databricks on AWS, including hands-on experience with predictive optimization, Unity Catalog, and managed Delta tables.
- Design, develop, and manage data pipelines (collection, storage, access), data engineering (data quality, ETL, data modeling), and data understanding (documentation, exploration).
- Perform data transformation tasks, including data cleansing, aggregation, enrichment, and normalization, using Databricks and related technologies (a PySpark sketch follows this list).
- Experience extracting data from heterogeneous sources (flat files, APIs, XML, RDBMSs) and implementing complex transformations such as SCDs in Databricks notebooks (an SCD Type 2 sketch follows this list).
- Monitor and troubleshoot data pipelines, identifying and resolving performance issues, data quality problems, and other technical challenges.
- Implement best practices for data governance, data security, and data privacy within the Databricks environment.
- Interact with stakeholders to build understanding of the data landscape, conduct discovery exercises, develop proofs of concept, and demonstrate them to stakeholders.
- Proven skills in AWS data engineering and data lake services such as AWS Glue, S3, Lambda, SNS, and IAM.
- Strong hands-on scripting experience with SQL, Python, and PySpark.
- Experience in data migration projects from on-premises environments to the AWS cloud.
- Experience designing, developing, and implementing end-to-end data engineering solutions using Databricks for large-scale data processing and data integration projects.
- Build and optimize data ingestion processes from various sources, ensuring data quality, reliability, and scalability.
- Ability to understand and articulate requirements to technical and non-technical audiences.
- Experience converting native ETL code to PySpark.
- Collaborate with DevOps and infrastructure teams to optimize the performance and scalability of Databricks clusters and resources.
- Perform code deployments using CI/CD pipelines.
- Stakeholder management and communication skills, including prioritization, problem solving, and interpersonal relationship building.
- Provide guidance and mentorship to junior data engineers, fostering a culture of knowledge sharing and continuous learning within the team.
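
For illustration, a minimal sketch of the kind of cleansing and aggregation step described above, assuming a Databricks notebook where the spark session is already defined; the table names (raw.orders, analytics.daily_order_totals) and columns are hypothetical.

from pyspark.sql import functions as F

# Read a hypothetical raw table registered in the metastore / Unity Catalog
raw_df = spark.table("raw.orders")

# Cleansing and normalization: deduplicate, drop incomplete rows, normalize the timestamp
cleaned_df = (
    raw_df
    .dropDuplicates(["order_id"])
    .filter(F.col("order_amount").isNotNull())
    .withColumn("order_date", F.to_date("order_ts"))
)

# Aggregation: daily totals per customer
daily_totals_df = (
    cleaned_df
    .groupBy("order_date", "customer_id")
    .agg(
        F.sum("order_amount").alias("total_amount"),
        F.count("order_id").alias("order_count"),
    )
)

# Persist the result; Delta is the default table format on Databricks,
# so this produces a managed Delta table without extra options
daily_totals_df.write.mode("overwrite").saveAsTable("analytics.daily_order_totals")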
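
And a minimal SCD Type 2 sketch using the Delta Lake MERGE API, again assuming a Databricks notebook with spark available; the tables dim_customer and stg_customer_updates, the key customer_id, and the tracked attribute address are hypothetical, and the staging table is assumed to carry the same business columns as the dimension.

from delta.tables import DeltaTable
from pyspark.sql import functions as F

updates_df = spark.table("stg_customer_updates")
current_df = spark.table("dim_customer").filter("is_current = true")

# Rows that are new or whose tracked attribute (here: address) changed
changed_df = (
    updates_df.alias("s")
    .join(current_df.alias("t"),
          F.col("s.customer_id") == F.col("t.customer_id"),
          "left")
    .filter(F.col("t.customer_id").isNull() |
            (F.col("s.address") != F.col("t.address")))
    .select("s.*")
)

# Step 1: expire the current dimension rows whose tracked attribute changed
dim = DeltaTable.forName(spark, "dim_customer")
(
    dim.alias("t")
    .merge(updates_df.alias("s"),
           "t.customer_id = s.customer_id AND t.is_current = true")
    .whenMatchedUpdate(
        condition="t.address <> s.address",
        set={"is_current": "false", "end_date": "current_date()"})
    .execute()
)

# Step 2: append the new row versions with fresh SCD2 metadata
(
    changed_df
    .withColumn("is_current", F.lit(True))
    .withColumn("start_date", F.current_date())
    .withColumn("end_date", F.lit(None).cast("date"))
    .write.mode("append").saveAsTable("dim_customer")
)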
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1528354