You will get to:
- Design, build, and maintain high-performance data pipelines that integrate large-scale transactional data from our payments platform, ensuring data quality, reliability, and compliance with regulatory requirements.
- Develop and manage distributed data processing pipelines for both high-volume data streams and batch processing workflows in a cloud-native AWS environment.
- Implement observability and monitoring tools to ensure the reliability and scalability of the data platform, enabling stakeholders to make confident, data-driven decisions.
- Collaborate with cross-functional teams to gather requirements and deliver business-critical data solutions, including automation of the payment transaction lifecycle, regulatory reporting, and compliance.
- Design and implement data models across various storage paradigms to support payment transactions at scale while ensuring efficient data ingestion, transformation, and storage.
- Maintain data integrity by implementing robust validation, testing, and error-handling mechanisms within data workflows (see the validation sketch after this list).
- Ensure that the data platform adheres to the highest standards for security, privacy, and governance.
- Provide mentorship and guidance to junior engineers, driving innovation, best practices, and continuous improvement across the team.
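As an illustration of the validation and error-handling responsibility above, here is a minimal sketch of a schema check on a batch of payment transactions using Pandera; the column names, checks, and sample data are hypothetical and intended only to show the general approach.

```python
# Minimal, hypothetical validation sketch: the schema and sample data below
# are illustrative placeholders, not an actual payments data model.
import pandas as pd
import pandera as pa
from pandera import Check, Column

# Hypothetical schema for an incoming payments feed.
transactions_schema = pa.DataFrameSchema(
    {
        "transaction_id": Column(str, Check.str_length(min_value=1), unique=True),
        "amount": Column(float, Check.gt(0)),
        "currency": Column(str, Check.isin(["USD", "EUR", "GBP"])),
    },
    strict=True,  # reject unexpected columns
)

def validate_batch(df: pd.DataFrame) -> pd.DataFrame:
    """Validate a batch of transactions; raises SchemaErrors on violations."""
    return transactions_schema.validate(df, lazy=True)  # lazy=True collects all failures

if __name__ == "__main__":
    sample = pd.DataFrame(
        {
            "transaction_id": ["t-001", "t-002"],
            "amount": [125.50, 9.99],
            "currency": ["USD", "EUR"],
        }
    )
    print(validate_batch(sample))
```

The same schema can also back unit tests or run as a validation step inside an orchestrated pipeline.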
Requirements:
- Proficiency in Python, with hands-on experience using data-focused libraries such as NumPy, Pandas, SQLAlchemy, and Pandera to build high-quality data pipelines.
- Strong expertise in AWS services (S3, Redshift, Lambda, Glue, Kinesis, etc.) for cloud-based data infrastructure and processing.
- Experience with multiple data storage models, including relational, columnar, and time-series databases.
- Proven ability to design and implement scalable, reliable, and high-performance data workflows, ensuring data integrity, performance, and availability.
- Experience with workflow orchestrators such as Apache Airflow or Argo Workflows for scheduling and automating data pipelines (see the orchestration sketch after this list).
- Familiarity with Python-based data stack tools such as dbt, Dask, Ray, Modin, and Pandas for data transformation and distributed processing.
- Hands-on experience with data ingestion, cataloging, and change-data-capture (CDC) tools.
- Understanding of DataOps and DevSecOps practices to ensure secure and efficient data pipeline development and deployment.
- Strong collaboration, communication, and problem-solving skills, with the ability to work effectively across multiple teams and geographies.
- Experience in payments or fintech platforms is a strong plus, particularly in processing high volumes of transactional data.
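As a rough illustration of the orchestration requirement above, the sketch below wires a hypothetical daily extract-validate-load pipeline as an Airflow DAG (assuming Apache Airflow 2.4+); the DAG id, task names, and placeholder callables are invented for the example.

```python
# Hypothetical daily pipeline sketch for Apache Airflow 2.4+.
# The callables are empty placeholders standing in for real extract/validate/load logic.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_transactions(**context):
    """Placeholder: pull the day's transactions from the source system."""

def validate_transactions(**context):
    """Placeholder: apply schema and data-quality checks (e.g. with Pandera)."""

def load_to_warehouse(**context):
    """Placeholder: write validated data to the warehouse (e.g. Redshift)."""

with DAG(
    dag_id="payments_daily_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_transactions)
    validate = PythonOperator(task_id="validate", python_callable=validate_transactions)
    load = PythonOperator(task_id="load", python_callable=load_to_warehouse)

    # Run the steps strictly in sequence.
    extract >> validate >> load
```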
Posted in: Data Engineering
Functional Area: Big Data / Data Warehousing / ETL
Job Code: 1545917