Posted on: 30/01/2026
JOB DESCRIPTION:
Responsibilities:
- Collaborate with Data Engineers and Data Scientists to integrate and process structured and unstructured data sets into actionable insights.
- Optimize PySpark jobs and data pipelines for performance, scalability, and reliability (see the PySpark tuning sketch after this list).
- Conduct regular financial risk assessments to identify potential vulnerabilities in data processing workflows.
- Ensure data quality and integrity throughout all stages of data processing.
- Develop and implement strategies to mitigate financial risks associated with data transformation and aggregation.
- Troubleshoot and debug issues related to data pipelines and processing.
- Ensure compliance with regulatory requirements and industry standards in all data processing activities.
- Implement best practices for data security, compliance, and privacy within the Azure environment.
- Document technical specifications, data flows, and solution architecture.
- Design and build reusable components, frameworks, and libraries at scale to support analytics products.
- Design and implement product features in collaboration with business and technology stakeholders.
- Design and develop scalable data pipelines using Azure Databricks and PySpark.
- Transform raw data into actionable insights through advanced data engineering techniques.
- Build, deploy, and maintain machine learning models using MLlib, TensorFlow, and MLflow (see the MLflow tracking sketch after this list).
- Optimize data integration workflows from Azure Blob Storage, Data Lake, and SQL/NoSQL sources.
- Execute large-scale data processing using Spark Pools, fine-tuning configurations for performance and cost-efficiency.
- Collaborate with data scientists, analysts, and business stakeholders to deliver robust data solutions.
- Maintain and enhance Databricks notebooks and Delta Lake architectures (see the Delta Lake sketch after this list).
- Architect, build, and maintain scalable and reliable data pipelines from diverse data sources.
- Design effective data storage, retrieval mechanisms, and data models to support analytics and business needs.
- Implement data validation, transformation, and quality monitoring processes.
- Collaborate with cross-functional teams to deliver impactful, data-driven solutions.
- Proactively identify bottlenecks and optimize existing workflows and processes.
- Provide guidance and mentorship to junior engineers on the team.
- Anticipate, identify, and solve data management issues to improve data quality.
- Clean, prepare, and optimize data at scale for ingestion and consumption.
- Drive the implementation of new data management projects and the restructuring of the current data architecture.
- Implement complex automated workflows and routines using workflow scheduling tools (see the Airflow DAG sketch after this list).
- Build continuous integration, test-driven development, and production deployment frameworks (see the pytest sketch after this list).
- Drive collaborative reviews of designs, code, test plans, and dataset implementations produced by other data engineers to maintain data engineering standards.
- Analyze and profile data to design scalable solutions.
- Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues.
- Mentor and develop other data engineers in adopting best practices.
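For context, a minimal PySpark sketch of the kind of pipeline tuning the optimization bullets describe; the storage paths, column names, and configuration values are illustrative assumptions, not details from this role.

```python
# Hypothetical tuned PySpark batch job; paths and columns are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-aggregation")
    # Example values only; real settings depend on cluster size and data volume.
    .config("spark.sql.shuffle.partitions", "200")
    .config("spark.sql.adaptive.enabled", "true")  # let AQE coalesce skewed shuffles
    .getOrCreate()
)

orders = spark.read.parquet("abfss://raw@account.dfs.core.windows.net/orders/")

daily_revenue = (
    orders
    .filter(F.col("status") == "COMPLETED")        # prune rows before the shuffle
    .groupBy(F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("revenue"))
)

# Partition the output by date so downstream reads can prune files.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "abfss://curated@account.dfs.core.windows.net/daily_revenue/"
)
```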
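Likewise, a short Delta Lake maintenance sketch of the MERGE/OPTIMIZE/VACUUM routine the Databricks and Delta Lake bullets imply; it assumes a Databricks notebook (where `spark` is predefined), and all paths are hypothetical.

```python
# Hypothetical Delta Lake upsert-and-maintain snippet (Databricks runtime assumed).
from delta.tables import DeltaTable

# `spark` is the SparkSession Databricks provides in notebooks.
events_path = "abfss://curated@account.dfs.core.windows.net/events_delta/"

# Upsert late-arriving records into the Delta table via MERGE.
incoming = spark.read.parquet("abfss://raw@account.dfs.core.windows.net/events_new/")

(
    DeltaTable.forPath(spark, events_path)
    .alias("t")
    .merge(incoming.alias("s"), "t.event_id = s.event_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)

# Compact small files, then remove snapshots older than 7 days.
spark.sql(f"OPTIMIZE delta.`{events_path}`")
spark.sql(f"VACUUM delta.`{events_path}` RETAIN 168 HOURS")
```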
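A minimal MLflow tracking sketch of the build-and-log loop the MLlib/TensorFlow/MLflow bullet refers to; scikit-learn stands in for the actual modelling stack here, and the experiment path is a placeholder.

```python
# Hypothetical MLflow experiment-tracking sketch; names and data are illustrative.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("/Shared/churn-model")  # Databricks-style experiment path

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    acc = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", acc)
    mlflow.sklearn.log_model(model, "model")  # versioned artifact for deployment
```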
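A bare-bones Airflow DAG of the sort the workflow-scheduling bullet suggests; the DAG id, schedule, and task bodies are invented for illustration.

```python
# Hypothetical Airflow 2.x DAG; task logic is stubbed out for brevity.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull raw files from source systems")


def transform():
    print("clean and aggregate with Spark")


with DAG(
    dag_id="daily_sales_pipeline",
    start_date=datetime(2026, 1, 1),
    schedule_interval="0 2 * * *",  # run nightly at 02:00
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t1 >> t2  # extract must finish before transform starts
```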
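Finally, a small pytest sketch of data-quality checks in the test-driven spirit of the CI/TDD bullet; the schema and validation rules are illustrative.

```python
# Hypothetical pytest data-quality tests against a local SparkSession.
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    # Small local session so tests run without a cluster.
    return SparkSession.builder.master("local[2]").appName("tests").getOrCreate()


def test_no_null_order_ids(spark):
    df = spark.createDataFrame([(1, 100.0), (2, 250.5)], ["order_id", "amount"])
    assert df.filter(df.order_id.isNull()).count() == 0


def test_amounts_are_positive(spark):
    df = spark.createDataFrame([(1, 100.0), (2, 250.5)], ["order_id", "amount"])
    assert df.filter(df.amount <= 0).count() == 0
```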
Qualifications:
- 3+ years of experience developing scalable Big Data applications or solutions on distributed platforms.
- Able to partner with others in solving complex problems by taking a broad perspective to identify innovative solutions.
- Strong skills in building positive relationships across Product and Engineering.
- Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders.
- Able to quickly pick up new programming languages, technologies, and frameworks.
- Experience working in Agile and Scrum development processes.
- Experience working in a fast-paced, results-oriented environment.
- Experience with Amazon Web Services (AWS) or other cloud platforms.
- Experience working with data warehousing tools, including DynamoDB, SQL, Amazon Redshift, and Snowflake.
- Experience architecting data products on streaming, serverless, and microservices architectures and platforms.
- Experience working with data platforms, including EMR, Databricks, etc.
- Experience working with distributed technology tools, including Spark, Presto, Scala, Python, Databricks, and Airflow.
- Working knowledge of data warehousing, data modelling, governance, and data architecture.
- Working knowledge of reporting and analytics tools such as Tableau and Amazon QuickSight.
- Demonstrated experience in learning new technologies and skills.
- Bachelor's degree in Computer Science, Information Systems, Business, or another relevant subject area.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1607680