Job Description

Description :


As a Data Engineer at Egisedge Technologies Pvt Ltd, you will be responsible for designing, developing, and maintaining scalable ETL/ELT data pipelines using Databricks (PySpark) on Azure/AWS/GCP.


Key Responsibilities :


- Design, develop, and maintain scalable ETL/ELT data pipelines using Databricks (PySpark) on Azure/AWS/GCP.


- Develop clean, reusable, and performant Python code for data ingestion, transformation, and quality checks.


- Write efficient and optimized SQL queries for querying structured and semi-structured data.


- Work with stakeholders to understand data requirements and implement end-to-end data workflows.


- Perform data profiling and validation to ensure data quality and integrity.


- Optimize data pipelines for performance and reliability, and integrate data from various sources (APIs, flat files, databases, and cloud storage, e.g., S3, ADLS).


- Build and maintain Delta tables using the Delta Lake format for ACID-compliant streaming and batch pipelines.


- Work with Databricks Workflows to orchestrate pipelines and scheduled jobs.


- Collaborate with DevOps and cloud teams to ensure secure, scalable, and compliant infrastructure.


Technical Skills Required :


Core Technologies :


- Databricks: Spark on Databricks, Delta Lake, Unity Catalog


- Python with strong knowledge of PySpark


- SQL: advanced level (joins, window functions, CTEs, aggregation)


ETL & Orchestration :


- Databricks Workflows / Jobs


- Airflow, Azure Data Factory, or similar orchestration tools


- Auto Loader / Structured Streaming (preferred)


Cloud Platforms (any one or more) :


- Azure: Databricks on Azure, ADLS, ADF, Synapse


- AWS: Databricks on AWS, S3, Glue, Redshift


- GCP: Dataproc, BigQuery, GCS


Data Modeling & Storage :


- Experience working with Delta Lake, Parquet, Avro


- Understanding of dimensional modeling, data lakes, and lakehouse architectures


Monitoring & Version Control :


- CI/CD pipelines for Databricks via Git, Azure DevOps, or Jenkins


- Logging, debugging, and monitoring with tools like Datadog, Prometheus, or cloud-native tools


Optional/Preferred :


- Knowledge of MLflow, Feature Store, or MLOps workflows


- Experience with REST APIs, JSON, and data ingestion from 3rd-party services


- Familiarity with dbt (Data Build Tool) or Great Expectations for data quality


Soft Skills :


- Strong analytical, problem-solving, and debugging skills


- Clear communication and documentation skills


- Ability to work independently and within cross-functional teams


- Agile/Scrum working experience

