Posted on: 06/11/2025
Description:
Key Responsibilities:
- Design, develop, and optimize large-scale data pipelines using PySpark and Scala (a minimal PySpark sketch follows this list).
- Implement ETL processes on big data platforms such as Hadoop and Azure Data Lake.
- Work with Azure services such as Azure Databricks, Azure Data Factory, Azure Synapse Analytics, and Azure Blob Storage.
- Develop, test, and maintain data ingestion and transformation frameworks using Python and Spark.
- Collaborate with cross-functional teams to integrate data from multiple sources and ensure high data quality.
- Implement data governance, security, and performance tuning best practices.
- Troubleshoot and optimize data workflows for scalability and efficiency.
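To make the pipeline work concrete, here is a minimal PySpark batch ETL sketch: it reads raw CSV files landed in Azure Data Lake Storage, applies basic cleansing, and writes curated Parquet. The storage account, container, and column names (order_id, order_ts, amount) are illustrative assumptions, not details from this posting.

```python
# Minimal PySpark batch ETL sketch. The abfss paths, container, storage
# account, and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Ingest raw CSV files landed in Azure Data Lake Storage.
raw = (
    spark.read
    .option("header", True)
    .csv("abfss://raw@examplestorage.dfs.core.windows.net/orders/")
)

# Basic cleansing: type casting, deduplication, and a derived partition column.
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropDuplicates(["order_id"])
       .withColumn("order_date", F.to_date("order_ts"))
)

# Write curated output partitioned by date for downstream consumers.
(
    orders.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("abfss://curated@examplestorage.dfs.core.windows.net/orders/")
)
```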
Mandatory Skills:
- PySpark and Scala programming
- Hadoop ecosystem (HDFS, Hive, HBase, etc.)
- Python for data processing and automation
- Azure Cloud (Databricks, ADF, Synapse, ADLS)
- Strong understanding of ETL, data modeling, and data warehousing concepts (a star-schema load sketch follows this list)
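As an illustration of the warehousing concepts above, the following sketch loads a fact table in a star schema: it resolves a natural key against a customer dimension's surrogate key, mapping unmatched rows to a default "unknown" member. All table and column names are hypothetical assumptions.

```python
# Hypothetical star-schema load: enrich staged sales with the surrogate
# key from a customer dimension. Table and column names are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("fact-load").getOrCreate()

dim_customer = spark.table("warehouse.dim_customer")  # customer_sk, customer_id
stg_sales = spark.table("staging.sales")              # customer_id, sale_date, amount

# Resolve the natural key to the dimension's surrogate key; rows with no
# matching dimension member fall back to a default "unknown" key of -1,
# a common data warehousing practice.
fact_sales = (
    stg_sales.join(dim_customer, "customer_id", "left")
             .fillna({"customer_sk": -1})
             .select("customer_sk", "sale_date", "amount")
)

fact_sales.write.mode("append").saveAsTable("warehouse.fact_sales")
```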
Good to Have:
- Experience with Kafka or Event Hub for streaming data (see the streaming sketch after this list)
- Knowledge of SQL and NoSQL databases
- Familiarity with CI/CD pipelines and DevOps tools
- Exposure to Delta Lake and Lakehouse architecture (the sketch below lands the stream in a Delta table)
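The streaming sketch below combines two of these items: it consumes events from a Kafka topic with Spark Structured Streaming and appends them to a Delta Lake table. The broker address, topic name, and paths are placeholders, and a Delta-enabled runtime (Databricks, or Spark with the delta-spark package) is assumed.

```python
# Hedged sketch: Kafka -> Spark Structured Streaming -> Delta Lake.
# Broker, topic, and storage paths are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Read a stream of events from a Kafka topic (topic name is hypothetical).
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
    .select(F.col("value").cast("string").alias("payload"),
            F.col("timestamp"))
)

# Append to a Delta table; the checkpoint location lets the stream
# recover its progress and avoid reprocessing on restart.
query = (
    events.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/events")
    .outputMode("append")
    .start("/mnt/delta/events")
)
query.awaitTermination()
```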
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1570496