Job Description

Key Responsibilities:

- Design, develop, and optimize large-scale data pipelines using PySpark and Scala.

- Implement ETL processes on big data platforms such as Hadoop and Azure Data Lake (a minimal sketch of one such job follows this list).

- Work with Azure services such as Azure Databricks, Azure Data Factory, Azure Synapse Analytics, and Azure Blob Storage.

- Develop, test, and maintain data ingestion and transformation frameworks using Python and Spark.

- Collaborate with cross-functional teams to integrate data from multiple sources and ensure high data quality.

- Implement data governance, security, and performance tuning best practices.

- Troubleshoot and optimize data workflows for scalability and efficiency.
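
For illustration, the following is a minimal PySpark sketch of the kind of batch pipeline described above: ingest raw files from Azure Data Lake Storage, clean and type the data, and write a partitioned curated copy. The storage account, container, paths, and column names are hypothetical placeholders, not details taken from this role.

```python
# Minimal PySpark ETL sketch. The ADLS account ("examplestorage"), containers,
# and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw CSV files from an ADLS Gen2 container (abfss:// scheme).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("abfss://raw@examplestorage.dfs.core.windows.net/orders/")
)

# Transform: drop rows missing keys, normalize types, derive a partition column.
clean = (
    raw.dropna(subset=["order_id", "order_ts"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("order_date", F.to_date("order_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
)

# Load: write the curated data back to ADLS, partitioned for downstream reads.
(
    clean.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("abfss://curated@examplestorage.dfs.core.windows.net/orders/")
)
```

On Azure Databricks, a job like this would typically be scheduled as a Databricks job or orchestrated from an Azure Data Factory pipeline, with storage credentials supplied through cluster configuration rather than hard-coded into paths.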

Mandatory Skills:

- PySpark and Scala programming

- Hadoop ecosystem (HDFS, Hive, HBase, etc.)

- Python for data processing and automation

- Azure Cloud (Databricks, ADF, Synapse, ADLS)

- Strong understanding of ETL, data modeling, and data warehousing concepts (see the star-schema sketch after this list)
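
As a rough illustration of the data-modeling expectation above, this sketch joins a fact table to a dimension in a star schema and aggregates a measure. The dw.fact_sales and dw.dim_customer tables, the customer_key surrogate key, and the column names are all invented for the example.

```python
# Star-schema sketch: fact table joined to a dimension on a surrogate key.
# Table and column names (dw.fact_sales, dw.dim_customer, etc.) are invented.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("star-schema-demo").getOrCreate()

fact_sales = spark.table("dw.fact_sales")      # hypothetical fact table
dim_customer = spark.table("dw.dim_customer")  # hypothetical dimension table

# Roll revenue up to a dimension attribute, as a warehouse report would.
revenue_by_segment = (
    fact_sales.join(dim_customer, "customer_key")
    .groupBy("segment")
    .agg(F.sum("net_amount").alias("total_revenue"))
)
revenue_by_segment.show()
```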

Good to Have:

- Experience with Kafka or Azure Event Hubs for streaming data (see the streaming sketch after this list)

- Knowledge of SQL and NoSQL databases

- Familiarity with CI/CD pipelines and DevOps tools

- Exposure to Delta Lake and Lakehouse architecture
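
For the streaming and Delta Lake items above, a hedged Structured Streaming sketch might look like the following: read events from Kafka and append them to a Delta table. The broker address, topic, and paths are placeholders, and running it requires the spark-sql-kafka and Delta Lake packages on the cluster.

```python
# Streaming sketch: Kafka -> Delta Lake with Spark Structured Streaming.
# Broker, topic, and paths are placeholders, not values from this posting.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-to-delta").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder broker
    .option("subscribe", "events")                      # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary; decode before persisting.
decoded = events.select(
    F.col("key").cast("string").alias("key"),
    F.col("value").cast("string").alias("payload"),
    "timestamp",
)

query = (
    decoded.writeStream
    .format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/events")  # enables recovery
    .outputMode("append")
    .start("/tmp/delta/events")  # placeholder Delta table path
)
query.awaitTermination()
```

Azure Event Hubs exposes a Kafka-compatible endpoint, so the same reader can usually be pointed at an Event Hubs namespace by changing the bootstrap server and adding SASL authentication options.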

