Posted on: 24/11/2025
Description:
Job Title: Data Engineer (Databricks | Snowflake | Python | SQL)
Key Responsibilities:
- Hands-on experience in data engineering activities such as data ingestion, transformation, cleansing, quality validation, and building scalable ETL/ELT pipelines.
- Strong experience working with Databricks (Delta Lake, Spark clusters, notebooks, jobs, workflows).
- Expertise in building PySpark pipelines for large-scale distributed data processing.
- Solid experience implementing data warehouse solutions using Snowflake, including schema design, performance optimization, Snowflake SQL, and Snowpipe.
- Experience with data migration projects moving data from legacy systems to Databricks/Snowflake.
- Experience consuming REST APIs using secure authentication methods (OAuth, IAM roles, service principals).
- Ability to orchestrate and automate jobs using Databricks Workflows, Airflow, or similar orchestration tools.
- Hands-on experience creating Delta Lake tables, managing schema evolution, time travel, and optimizing storage.
- Strong understanding of Snowflake features such as micro-partitioning, time travel, data sharing, and RBAC security.
- Working knowledge of Azure cloud services for storage, compute, IAM, and networking.
- Good understanding of CI/CD pipelines for data engineering deployment.
- Snowflake or Databricks Certification is highly preferred.
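The cleansing and quality-validation work described above can be illustrated with a minimal sketch. This is plain Python, not a production pipeline; the `Order` record shape and the validation rules are hypothetical examples only.

```python
from dataclasses import dataclass

# Hypothetical record shape used only for this sketch.
@dataclass
class Order:
    order_id: str
    amount: float
    country: str

def validate(records):
    """Split records into clean rows and rejects: a minimal
    quality-validation step of the kind an ETL pipeline would run
    before loading data downstream."""
    clean, rejects = [], []
    for r in records:
        errors = []
        if not r.order_id:
            errors.append("missing order_id")
        if r.amount < 0:
            errors.append("negative amount")
        if len(r.country) != 2:
            errors.append("country must be ISO-2")
        (clean if not errors else rejects).append((r, errors))
    return [r for r, _ in clean], rejects

rows = [Order("A1", 10.0, "US"), Order("", -5.0, "USA")]
good, bad = validate(rows)
# good holds the first order; bad holds the second with three errors
```

In a real pipeline the same pattern would typically run as a PySpark transformation, with rejects routed to a quarantine table for review.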
Technical Skills & Expertise:
Databricks Expertise:
- Working with Databricks Runtime, cluster configuration, autoscaling, and optimization.
- Creating and managing Delta Lake tables, Delta Live Tables, and implementing CDC pipelines.
- Configuring Databricks Jobs/Workflows, REST API integrations, and Git-based version control.
- Implementing advanced PySpark transformations, performance tuning, and caching strategies.
- Integration of Databricks with Azure Data Lake Storage (ADLS) and Snowflake connectors.
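The CDC pipelines mentioned above boil down to upsert/delete semantics, which Delta Lake applies at scale via `MERGE INTO`. The following is a plain-Python sketch of that merge logic only; the event fields (`op`, `id`) are illustrative, not a real Databricks API.

```python
def apply_cdc(target, changes):
    """Apply a batch of CDC events to a target table held as a
    {key: row} dict. Mirrors the upsert/delete semantics that a
    Delta Lake MERGE INTO statement performs on a Delta table."""
    for event in changes:
        key = event["id"]
        if event["op"] == "delete":
            target.pop(key, None)
        else:
            # "insert" and "update" both upsert the row.
            target[key] = {k: v for k, v in event.items() if k != "op"}
    return target

table = {1: {"id": 1, "name": "alice"}}
batch = [
    {"op": "update", "id": 1, "name": "alicia"},
    {"op": "insert", "id": 2, "name": "bob"},
    {"op": "delete", "id": 1},
]
apply_cdc(table, batch)
# table now holds only the row with id 2
```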
Snowflake Expertise:
- Designing data warehouse schemas, building tables, views, materialized views, and stored procedures.
- Query tuning, clustering keys, result caching, and warehouse performance optimization.
- Implementing Snowpipe, Streams & Tasks for automated ingestion and CDC workflows.
- Managing access controls, roles, masking policies, and secure data sharing.
- Integrating Snowflake with Databricks, cloud services, and external APIs.
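The Streams & Tasks pattern above rests on one idea: a stream returns only the rows changed since it was last consumed, and consuming it advances its offset. A minimal plain-Python analogue of that behavior (not Snowflake's actual implementation):

```python
class Stream:
    """Toy analogue of a Snowflake stream over an append-only
    table: each read() returns only rows appended since the
    previous read, and consumption advances the offset."""
    def __init__(self, table):
        self.table = table
        self.offset = 0

    def read(self):
        new = self.table[self.offset:]
        self.offset = len(self.table)
        return new

events = []
stream = Stream(events)
events.extend(["row1", "row2"])
first = stream.read()    # ["row1", "row2"]
events.append("row3")
second = stream.read()   # ["row3"]
```

In Snowflake itself, a Task would poll the stream on a schedule and run DML against it, which is what advances the stream's offset.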
Azure Expertise:
- ADLS: data lifecycle, encryption, replication, versioning.
- Azure Compute: provisioning and managing compute for data workloads.
- Azure IAM: managing roles, service principals, and access control.
- Networking: VNets, subnets, routing, private endpoints.
CI/CD & DevOps Tools:
- GitHub / GitHub Actions: version control, branching, automated deployment.
- Jenkins / Azure DevOps: building pipelines for testing and data pipeline deployment.
- SonarQube: static code analysis, security checks, and CI integration.
DevOps/MLOps & AI/ML Awareness:
- Good understanding of the ML lifecycle: data preparation, training, deployment, and model monitoring.
- Experience supporting Data Scientists in deploying notebooks/models to production using Databricks ML or Snowflake Snowpark.
- Familiarity with tools like MLflow, Databricks Model Registry, Airflow, and orchestration frameworks.
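The model-monitoring step of the lifecycle above can be sketched with a deliberately minimal drift check: compare a live feature's mean against its training baseline. Real monitoring systems use statistical tests such as PSI or KS; the threshold and numbers here are illustrative.

```python
import statistics

def mean_shift(baseline, live, threshold=0.25):
    """Flag drift when the live feature mean moves more than
    `threshold` (as a fraction) away from the training baseline.
    A minimal monitoring check, not a substitute for PSI/KS tests."""
    base = statistics.mean(baseline)
    cur = statistics.mean(live)
    shift = abs(cur - base) / abs(base)
    return shift > threshold, shift

drifted, shift = mean_shift([10, 11, 9, 10], [14, 15, 13])
# baseline mean 10, live mean 14 -> shift 0.4 -> drifted is True
```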
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1579824