
Job Description:

- Expertise in designing, implementing, and maintaining data solutions, including Delta Lake, data warehouses, data marts, and data pipelines, on the Databricks platform in support of business and technology objectives.

- Apply design best practices in data modeling (logical and physical) and ETL pipelines (streaming and batch) using AWS cloud services.

- Proficiency in ETL implementation on Databricks on AWS, including hands-on experience with predictive optimization, Unity Catalog, and managed Delta tables.

- Design, develop, and manage data pipelines (collection, storage, access), data engineering (data quality, ETL, data modeling), and data understanding (documentation, exploration).

- Perform data transformation tasks, including data cleansing, aggregation, enrichment, and normalization, using Databricks and related technologies (see the first sketch after this list).

- Experience extracting data from heterogeneous sources (flat files, APIs, XML, RDBMSs) and implementing complex transformations such as slowly changing dimensions (SCDs) in Databricks notebooks (see the second sketch after this list).

- Monitor and troubleshoot data pipelines, identifying and resolving performance issues, data quality problems, and other technical challenges.

- Implement best practices for data governance, data security, and data privacy within the Databricks environment.

- Interact with stakeholders to understand the data landscape, conduct discovery exercises, develop proofs of concept, and demonstrate them.

- Proven skills in AWS data engineering and data lake services such as AWS Glue, S3, Lambda, SNS, and IAM.

- Strong hands-on scripting experience in SQL, Python, and PySpark.

- Experience in data migration projects from on-premises systems to the AWS cloud.

- Experience designing, developing, and implementing end-to-end data engineering solutions using Databricks for large-scale data processing and data integration projects.

- Build and optimize data ingestion processes from various sources, ensuring data quality, reliability, and scalability (see the third sketch after this list).

- Ability to understand and articulate requirements to technical and non-technical audiences.

- Experience converting native ETL-tool code to PySpark.

- Collaborate with DevOps and infrastructure teams to optimize the performance and scalability of Databricks clusters and resources.

- Perform code deployments using CI/CD pipelines.

- Stakeholder management and communication skills, including prioritization, problem solving, and interpersonal relationship building.

- Provide guidance and mentorship to junior data engineers, fostering a culture of knowledge sharing and continuous learning within the team.
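
The bullets above state requirements rather than walk through a method, but a few short sketches may help illustrate the kind of work described. First, a minimal PySpark batch transformation covering cleansing, normalization, and aggregation, matching the data transformation bullet; all table names, column names, and rules (bronze.orders, order_id, amount, etc.) are illustrative assumptions, not taken from this posting:

```python
# Hedged sketch: cleanse, normalize, and aggregate a batch table with PySpark.
# Table/column names and the quality rules are illustrative assumptions only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-cleanse-aggregate").getOrCreate()

raw = spark.read.table("bronze.orders")  # assumed raw (bronze) source table

cleansed = (
    raw
    .dropDuplicates(["order_id"])                       # cleansing: de-duplicate on key
    .filter(F.col("amount").isNotNull())                # cleansing: drop incomplete rows
    .withColumn("country", F.upper(F.trim("country")))  # normalization: consistent casing
)

daily = (  # aggregation: daily totals per country
    cleansed
    .groupBy("country", F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("total_amount"),
         F.count("*").alias("order_count"))
)

daily.write.mode("overwrite").saveAsTable("silver.orders_daily")
```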
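
Second, for the SCD bullet, a hedged sketch of a Type 2 slowly changing dimension upsert using the Delta Lake MERGE API, a common pattern on Databricks. The table names (silver.customers, staging.customer_updates), key (customer_id), and tracked attribute (address) are assumptions for illustration:

```python
# Hedged sketch: SCD Type 2 upsert with Delta Lake MERGE.
# Names, the key, and the tracked attribute are illustrative assumptions.
from delta.tables import DeltaTable
from pyspark.sql import functions as F

updates = spark.read.table("staging.customer_updates")  # assumed staging feed
current = spark.read.table("silver.customers")          # assumed dimension table

# Rows whose tracked attribute changed need two actions: close the old
# version and insert a new current one. Staging them twice -- once with a
# NULL merge key to force an insert -- lets a single MERGE do both.
changed = (updates.alias("s")
    .join(current.alias("t"),
          (F.col("s.customer_id") == F.col("t.customer_id"))
          & F.col("t.is_current")
          & (F.col("s.address") != F.col("t.address")))
    .select("s.*"))

staged = (changed.withColumn("merge_key", F.lit(None).cast("string"))
          .unionByName(updates.withColumn("merge_key", F.col("customer_id"))))

(DeltaTable.forName(spark, "silver.customers").alias("t")
 .merge(staged.alias("s"), "t.customer_id = s.merge_key AND t.is_current")
 .whenMatchedUpdate(                     # close out the superseded version
     condition="t.address <> s.address",
     set={"is_current": "false", "end_date": "current_date()"})
 .whenNotMatchedInsert(                  # new keys, plus new versions of changed keys
     values={"customer_id": "s.customer_id",
             "address": "s.address",
             "is_current": "true",
             "start_date": "current_date()",
             "end_date": "null"})
 .execute())
```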
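
Finally, for the ingestion bullet, a minimal sketch of incremental file ingestion with Databricks Auto Loader plus a simple quality gate. It assumes a Databricks runtime (the cloudFiles source is not available in open-source Spark), and the S3 path, checkpoint locations, and quality rule are invented for illustration:

```python
# Hedged sketch: incremental ingestion with Auto Loader (Databricks runtime only).
# Paths, schema location, and the quality rule are illustrative assumptions.
from pyspark.sql import functions as F

stream = (spark.readStream.format("cloudFiles")  # Auto Loader source
          .option("cloudFiles.format", "json")
          .option("cloudFiles.schemaLocation", "/mnt/checkpoints/orders/schema")
          .load("s3://example-bucket/raw/orders/"))

# Flag rather than drop suspect records so nothing is silently lost.
validated = stream.withColumn(
    "quarantined", F.col("order_id").isNull() | (F.col("amount") < 0))

(validated.writeStream
 .option("checkpointLocation", "/mnt/checkpoints/orders")
 .trigger(availableNow=True)  # incremental, batch-style run
 .toTable("bronze.orders"))
```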
