hirist

AWS Data Engineer - PySpark/Databricks

Xohani Solutions Pvt. Ltd.
Bangalore
4 - 10 Years

Posted on: 20/01/2026

Job Description



We are seeking highly skilled Databricks Data Engineers to join our data modernization team. You will play a pivotal role in designing, developing, and maintaining robust data solutions on the Databricks platform. Your experience in data engineering, along with a deep understanding of Databricks, will be instrumental in building solutions that drive data-driven decision-making for a variety of customers.


Experience Band : 4 - 10 years


Location : Bangalore.


Mandatory Skills : AWS Glue, Lambda, PySpark, SQL, and Databricks.


Good to have Skills : Redshift, Airflow, DLT, Databricks administration


This position is responsible for developing ETL/ELT and file-movement processes for data and integrations. Key responsibilities include processing and moving data between different compute and storage services, as well as on-premises data sources, at specified intervals, along with the creation, scheduling, orchestration, and management of data pipelines.


Data engineers are responsible for ensuring the availability and quality of data needed for analysis and business transactions. This includes data integration, acquisition, cleansing, and harmonization: transforming raw data into curated datasets for data science, data discovery, and BI/analytics. The role is responsible for developing, constructing, testing, and maintaining datasets and scalable data processing systems.


Data engineers work most closely with Data Architects and Data Scientists. They also work with business and IT groups beyond the data sphere, understanding the enterprise infrastructure and the many source systems.


Inputs are raw datasets; outputs are analytics-ready, integrated/curated datasets.
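As a hedged illustration of that raw-to-curated flow, a single cleansing/harmonization step might look like the sketch below. All field names (customer_id, country, amount, event_date) and rules are hypothetical assumptions for illustration, not part of this role description; a real pipeline would express the same logic in PySpark on Databricks.

```python
from datetime import datetime
from typing import Optional

def curate_record(raw: dict) -> Optional[dict]:
    """Cleanse and harmonize one raw record into a curated shape.
    Drops records missing the business key; normalizes types and casing.
    (Illustrative stand-in for a PySpark transformation.)"""
    if not raw.get("customer_id"):
        return None  # data-quality rule: key is mandatory
    return {
        "customer_id": str(raw["customer_id"]).strip(),
        "country": (raw.get("country") or "unknown").strip().lower(),
        "amount": round(float(raw.get("amount", 0.0)), 2),
        "event_date": datetime.strptime(raw["event_date"], "%Y-%m-%d").date().isoformat(),
    }

raw_rows = [
    {"customer_id": " 42 ", "country": "IN ", "amount": "199.999", "event_date": "2026-01-20"},
    {"customer_id": None, "country": "IN", "amount": "5", "event_date": "2026-01-20"},
]
# Curated output keeps only valid, normalized records.
curated = [r for r in (curate_record(row) for row in raw_rows) if r is not None]
```

The same shape of logic (filter on a mandatory key, cast and trim columns) maps directly onto PySpark `filter`/`withColumn` calls at scale.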


Responsibilities :


- Development experience as a data engineer with a focus on core tools and technologies such as AWS Lambda, Glue, S3, PySpark, SQL, and Databricks. Experience in Redshift, Athena, and Airflow is an added advantage.


- Design, develop, and optimize ETL/ELT data pipelines using Glue/Lambda with Databricks.


- Strong SQL and Python/PySpark skills for data transformation and analysis.


- Work with structured, semi-structured, and unstructured data sources.


- Troubleshoot and optimize data workflows for scalability and performance.


- Strong experience with performance-oriented ELT/ETL design and implementation for large, complex datasets using PySpark, SQL, Databricks, Glue, or Lambda.


- Design and build high-performance, scalable data pipelines adhering to delta/medallion/lakehouse, data warehouse, and data mart standards for optimal storage, retrieval, and processing of data.


- Databricks : Hands-on experience with Databricks for data processing and analytics, including Serverless SQL Warehouses, Unity Catalog, and optimization of code and clusters.


- Develop data profiling and data quality methodologies and embed them into the processes involved in transforming data across the systems.


- Experience in Agile development and code deployment using GitHub and CI/CD pipelines.


- Ability to work with business owners to define key business requirements and convert them into technical specifications.


- Experience with security models and development on large data sets


- Responsible for system testing, ensuring effective resolution of defects, timely discussion of business issues, and appropriate management of resources relevant to data and integration.
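A minimal sketch of the delta/medallion layering named in the responsibilities above (bronze raw, silver cleansed, gold analytics-ready), using plain Python collections as stand-ins for Delta tables. Layer contents, field names, and the dedup/aggregation rules are illustrative assumptions, not part of this posting.

```python
from collections import defaultdict

# Bronze layer: raw ingested events, kept as-is (duplicates included).
bronze = [
    {"order_id": 1, "region": "south", "amount": "100.0"},
    {"order_id": 1, "region": "south", "amount": "100.0"},  # duplicate ingest
    {"order_id": 2, "region": "north", "amount": "250.5"},
]

# Silver layer: cleansed and deduplicated by business key (order_id),
# with amounts cast to numeric types.
silver_by_key = {}
for row in bronze:
    silver_by_key[row["order_id"]] = {**row, "amount": float(row["amount"])}
silver = list(silver_by_key.values())

# Gold layer: aggregated, analytics-ready mart (revenue per region).
gold = defaultdict(float)
for row in silver:
    gold[row["region"]] += row["amount"]
gold = dict(gold)
```

In a real Databricks pipeline each layer would be a Delta table, with the silver dedup expressed as a MERGE on the business key and the gold mart as a grouped aggregation.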


Preferred Qualifications / Certifications :


- Bachelor's degree in computer science, information technology, or management information systems, or equivalent work experience.


- Experience working in regulated environments and with internal systems quality policies and procedures.


- Experience in development and deployment on cloud infrastructure.


- Pharmaceutical or healthcare industry experience.


- AWS and Databricks certifications are good to have.


Note :


All data engineer roles require foundational knowledge in communication, leadership, teamwork, problem-solving, solution/blueprint definition, business acumen, architectural processes (e.g., blueprinting, reference architecture, governance), technical standards, project delivery, and industry knowledge.

