
Senior Data Engineer - Spark/Tableau

Xped pvt Ltd
Guntur
5 - 12 Years

Posted on: 01/12/2025

Job Description

Responsibilities:

- Collaborate with Data Engineers and Data Scientists to integrate and process structured and unstructured data sets into actionable insights.

- Optimize PySpark jobs and data pipelines for performance, scalability, and reliability.

- Conduct regular financial risk assessments to identify potential vulnerabilities in data processing workflows.

- Ensure data quality and integrity throughout all stages of data processing.

- Develop and implement strategies to mitigate financial risks associated with data transformation and aggregation.

- Troubleshoot and debug issues related to data pipelines and processing.

- Ensure compliance with regulatory requirements and industry standards in all data processing activities.

- Implement best practices for data security, compliance, and privacy within the Azure environment.

- Document technical specifications, data flows, and solution architecture.

- Design and build reusable components, frameworks, and libraries at scale to support analytics products.

- Design and implement product features in collaboration with business and technology stakeholders.

- Design and develop scalable data pipelines using Azure Databricks and PySpark (see the pipeline sketch after this list).

- Transform raw data into actionable insights through advanced data engineering techniques.

- Build, deploy, and maintain machine learning models using MLlib, TensorFlow, and MLflow (see the tracking sketch after this list).

- Optimize data integration workflows from Azure Blob Storage, Data Lake, and SQL/NoSQL sources.

- Execute large-scale data processing using Spark Pools, fine-tuning configurations for performance and cost-efficiency.

- Collaborate with data scientists, analysts, and business stakeholders to deliver robust data solutions.

- Maintain and enhance Databricks notebooks and Delta Lake architectures.

- Architect, build, and maintain scalable and reliable data pipelines from diverse data sources.

- Design effective data storage, retrieval mechanisms, and data models to support analytics and business needs.

- Implement data validation, transformation, and quality monitoring processes.

- Collaborate with cross-functional teams to deliver impactful, data-driven solutions.

- Proactively identify bottlenecks and optimize existing workflows and processes.

- Provide guidance and mentorship to junior engineers in the team.

- Anticipate, identify, and solve issues concerning data management to improve data quality.

- Clean, prepare, and optimize data at scale for ingestion and consumption.

- Drive the implementation of new data management projects and the restructuring of the current data architecture.

- Implement complex automated workflows and routines using workflow scheduling tools (see the Airflow sketch after this list).

- Build continuous integration, test-driven development, and production deployment frameworks (see the test sketch after this list).

- Drive collaborative reviews of designs, code, test plans, and dataset implementations by other data engineers to maintain data engineering standards.

- Analyze and profile data to design scalable solutions.

- Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues.

- Mentor and develop other data engineers in adopting best practices.
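
To make the Databricks and PySpark responsibilities above concrete, here is a minimal pipeline sketch: it assumes an Azure landing zone, an events feed with event_id and event_ts columns, and placeholder storage paths; none of these details come from the posting itself.

```python
# A minimal sketch, assuming an Azure landing zone and an "events" feed:
# read raw JSON, apply basic quality gates, write a date-partitioned Delta table.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-pipeline").getOrCreate()

# Ingest raw JSON from cloud storage (path is a placeholder).
raw = spark.read.json("abfss://landing@example.dfs.core.windows.net/events/")

# Quality gates: require a key, normalize the timestamp, deduplicate.
clean = (
    raw.filter(F.col("event_id").isNotNull())
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .dropDuplicates(["event_id"])
)

# Partition by date so downstream scans stay cheap, then append as Delta.
(
    clean.withColumn("event_date", F.to_date("event_ts"))
    .write.format("delta")
    .mode("append")
    .partitionBy("event_date")
    .save("abfss://curated@example.dfs.core.windows.net/events/")
)
```

On Databricks the delta format is available out of the box; outside Databricks it requires the delta-spark package.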
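
In the same spirit, a hedged sketch of the MLlib-plus-MLflow pairing the posting names: fit a simple pipeline and record its parameters and fitted model with MLflow tracking. The feature columns, label, and training path are illustrative assumptions.

```python
# A hedged sketch: fit a small MLlib pipeline and track it with MLflow.
# Feature columns, label, and the training path are illustrative assumptions.
import mlflow
import mlflow.spark
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lr-training").getOrCreate()
train = spark.read.format("delta").load("/mnt/curated/training_data")

assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label", maxIter=20)
pipeline = Pipeline(stages=[assembler, lr])

# Record the run: parameters plus the fitted pipeline as a model artifact.
with mlflow.start_run(run_name="lr-baseline"):
    model = pipeline.fit(train)
    mlflow.log_param("maxIter", 20)
    mlflow.spark.log_model(model, artifact_path="model")
```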
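
For the workflow-scheduling responsibility, a minimal Apache Airflow DAG (Airflow appears in the qualifications below); the DAG id, task split, and callables are assumptions for the example.

```python
# A minimal Airflow 2.x DAG: a daily extract -> transform -> load chain.
# The dag_id, schedule, and task callables are placeholder assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data from source systems")

def transform():
    print("clean and aggregate with Spark")

def load():
    print("publish curated tables for analytics")

with DAG(
    dag_id="daily_events_pipeline",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```

The schedule argument is the Airflow 2.4+ spelling; older 2.x releases use schedule_interval instead.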
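
Finally, a sketch of the test-driven-development angle: a pytest unit test around a PySpark transformation. The dedupe_events function is hypothetical, standing in for whatever transformation is under test.

```python
# A small pytest-style unit test for a PySpark transformation.
# dedupe_events and its columns are hypothetical stand-ins for real logic.
import pytest
from pyspark.sql import SparkSession

def dedupe_events(df):
    # Hypothetical transformation under test: keep one row per event_id.
    return df.dropDuplicates(["event_id"])

@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

def test_dedupe_keeps_one_row_per_event(spark):
    df = spark.createDataFrame(
        [("e1", 1), ("e1", 2), ("e2", 3)],
        ["event_id", "value"],
    )
    out = dedupe_events(df)
    assert out.count() == 2
    assert sorted(r.event_id for r in out.collect()) == ["e1", "e2"]
```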

Qualifications:

- 3+ years of experience developing scalable Big Data applications or solutions on distributed platforms.

- Able to partner with others in solving complex problems by taking a broad perspective to identify innovative solutions.

- Strong skills in building positive relationships across Product and Engineering.

- Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders.

- Able to quickly pick up new programming languages, technologies, and frameworks.

- Experience working in Agile and Scrum development processes.

- Experience working in a fast-paced, results-oriented environment.

- Experience with Amazon Web Services (AWS) or other cloud platforms.

- Experience working with data warehousing tools, including DynamoDB, SQL, Amazon Redshift, and Snowflake.

- Experience architecting data products on streaming, serverless, and microservices architectures and platforms.

- Experience working with data platforms, including EMR, Databricks, etc.

- Experience working with distributed technology tools, including Spark, Presto, Scala, Python, Databricks, and Airflow.

- Working knowledge of data warehousing, data modelling, governance, and data architecture.

- Working knowledge of reporting and analytical tools such as Tableau and Amazon QuickSight.

- Demonstrated experience in learning new technologies and skills.

- Bachelor's degree in Computer Science, Information Systems, Business, or another relevant subject area.

