
Databricks Data Architect - Python/SQL/ETL

Sheryl Strategic Solutions Pvt. Ltd.
Multiple Locations
3 - 6 Years

Posted on: 13/01/2026

Job Description

Description :

- Role : Databricks Data Architect

- Experience : 3+ Years in Data Architecture

- Location : Remote (Offshore, India)

- Duration : 6+ Months (Contract)

Role Summary :

We are seeking a highly technical Databricks Data Architect to design and implement robust, scalable data platforms using the Databricks Lakehouse architecture. In this role, you will be responsible for the end-to-end design of high-performance data pipelines, focusing on the Medallion Architecture (Bronze, Silver, Gold) and Unity Catalog for governance.


You will leverage Delta Live Tables (DLT) and automated workflows to ensure data reliability and secure data partitioning. The position requires deep proficiency in Python and SQL for complex transformations, alongside experience in CI/CD and Infrastructure as Code (Terraform). You will play a critical role in integrating Change Data Capture (CDC) patterns and building securely partitioned, multi-tenant data products for diverse domains, ensuring a seamless flow from raw cloud storage to business-ready analytics.
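
To make the summary above concrete, here is a minimal Delta Live Tables sketch of a Medallion (Bronze/Silver/Gold) flow in Python. The storage path, table names, and columns are hypothetical, and `spark` is the session a DLT pipeline provides implicitly:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Bronze: raw events landed from cloud storage via Auto Loader.")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")  # Auto Loader for incremental file ingestion
        .option("cloudFiles.format", "json")
        .load("abfss://landing@example.dfs.core.windows.net/events/")  # hypothetical path
    )

@dlt.table(comment="Silver: validated, deduplicated events.")
@dlt.expect_or_drop("valid_event_id", "event_id IS NOT NULL")  # data-quality expectation
def silver_events():
    return dlt.read_stream("bronze_events").dropDuplicates(["event_id"])

@dlt.table(comment="Gold: daily aggregates ready for BI consumption.")
def gold_daily_event_counts():
    return (
        dlt.read("silver_events")
        .groupBy(F.to_date("event_ts").alias("event_date"))
        .agg(F.count("*").alias("events"))
    )
```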

Responsibilities :

- Lakehouse Architecture Design : Architect and govern modern data platforms using the Databricks Lakehouse paradigm, implementing the Medallion Architecture for optimized data processing.

- Unity Catalog Governance : Implement and manage Unity Catalog to ensure centralized table-level governance, fine-grained access control, and robust security models across the workspace.

- Pipeline Orchestration : Design and deploy automated data pipelines using Delta Live Tables (DLT) and Databricks Workflows to facilitate reliable ETL/ELT processes.

- Incremental Data Ingestion : Develop and optimize Change Data Capture (CDC) patterns, utilizing SQL Server Change Tracking or timestamp-based incremental ingestion strategies (a minimal sketch of the timestamp-based pattern follows this list).

- Data Modeling & Schema Design : Lead the design of domain-driven data models and multi-tenant architectures, ensuring secure data partitioning and schema evolution.

- Infrastructure as Code (IaC) : Utilize Terraform to automate the provisioning and configuration of Databricks workspaces, clusters, and governance structures.

- Cloud Storage Integration : Manage and optimize data interactions with cloud storage technologies such as Azure Data Lake Storage (ADLS), AWS S3, or Google Cloud Storage.

- Advanced Data Transformation : Author high-performance transformation logic using PySpark (Python) and Spark SQL, focusing on performance tuning and cost optimization.

- CI/CD & DevOps : Implement Git-based workflows and CI/CD pipelines to automate the testing and deployment of data assets and infrastructure configurations.

- Analytics Ecosystem Alignment : Collaborate with stakeholders to ensure data products are optimized for downstream analytics tools like Power BI, Tableau, or SAP BusinessObjects.
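
As referenced in the Incremental Data Ingestion item above, here is a minimal sketch of timestamp-based incremental ingestion into Delta Lake. The JDBC source, table names, and key/timestamp columns are hypothetical, and a production version would also need to handle an empty target (no watermark yet) and parameterize the query safely:

```python
from delta.tables import DeltaTable
from pyspark.sql import functions as F

def ingest_increment(spark, jdbc_url, src_table, target_name, key_col, ts_col):
    # High watermark: the newest change already applied to the Delta target.
    watermark = spark.table(target_name).agg(F.max(ts_col)).first()[0]

    # Pull only rows modified after the watermark from the source system.
    source = (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("query", f"SELECT * FROM {src_table} WHERE {ts_col} > '{watermark}'")
        .load()
    )

    # MERGE keeps re-runs idempotent: update changed rows, insert new ones.
    (
        DeltaTable.forName(spark, target_name).alias("t")
        .merge(source.alias("s"), f"t.{key_col} = s.{key_col}")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )
```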

Technical Requirements :

- Databricks Mastery : 3+ years of experience in data architecture with hands-on expertise in Delta Lake, DLT, and Unity Catalog (a minimal grant sketch follows this list).

- Programming Proficiency : Strong coding skills in Python and SQL for enterprise-scale data transformation.

- Architectural Frameworks : Solid understanding of ETL/ELT patterns, medallion architecture, and star/snowflake schema designs.

- Cloud Infrastructure : Practical knowledge of cloud storage (ADLS/S3) and cloud-native security models.

- DevOps Foundation : Experience with Git-based workflows and CI/CD best practices for data engineering.

- Change Data Capture : Knowledge of CDC patterns (SQL Server Change Tracking or timestamp-based) for incremental data synchronization.
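
As referenced in the Databricks Mastery item above, a small illustration of Unity Catalog's hierarchical grant model; the catalog, schema, table, and principal names are hypothetical:

```python
def grant_read_access(spark, table_fqn: str, principal: str):
    # Unity Catalog privileges are hierarchical: SELECT on a table only takes
    # effect if the principal can also USE the enclosing catalog and schema.
    catalog, schema, _table = table_fqn.split(".")
    spark.sql(f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`")
    spark.sql(f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{principal}`")
    spark.sql(f"GRANT SELECT ON TABLE {table_fqn} TO `{principal}`")

# Usage with hypothetical names:
# grant_read_access(spark, "main.gold.daily_event_counts", "analysts")
```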

Preferred Skills :

- Infrastructure Automation : Experience using Terraform for Databricks infrastructure governance and resource scaling.

- Domain Expertise : Background in working with education data models (e.g., CEDS), ERP systems, or operational domain-driven designs.

- Multi-Tenancy : Experience architecting securely partitioned data products for multi-tenant environments (see the dynamic-view sketch after this list).

- Analytics Integration : Understanding of how data modeling affects performance in visualization tools like Power BI.

- Problem Solving : Ability to deconstruct complex business requirements into scalable, autonomous data pipelines.
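
As referenced in the Multi-Tenancy item above, one common way to enforce tenant partitioning on Databricks is a dynamic view filtered by group membership. is_account_group_member() is a built-in Databricks SQL function; the table, column, and group-naming convention here are assumptions:

```python
# Assumed convention: each tenant's users belong to a group named tenant_<tenant_id>.
# The view then returns only rows for tenants the querying user belongs to.
spark.sql("""
    CREATE OR REPLACE VIEW main.gold.orders_secure AS
    SELECT *
    FROM main.gold.orders
    WHERE is_account_group_member(CONCAT('tenant_', tenant_id))
""")
```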

