Job Description

Job Summary:


We are seeking a skilled Data Quality Engineer to ensure the accuracy, reliability, and integrity of our data pipelines and workflows.


The ideal candidate will have hands-on data engineering experience, with a strong focus on quality testing, validation, and pipeline orchestration.


Key Responsibilities:


- Design, develop, and execute data quality test cases to validate data pipelines and ETL/ELT processes.


- Monitor and trigger data pipelines, ensuring smooth execution and timely data delivery.


- Run and maintain data quality scripts to identify anomalies, inconsistencies, and data integrity issues (a brief sketch of such a script follows this list).


- Perform data profiling and validation across multiple data sources and targets.


- Collaborate with data engineers to implement data quality checks at various stages of the pipeline.


- Perform root cause analysis (RCA) for data anomalies and pipeline failures.


- Troubleshoot pipeline failures and data quality issues, working to resolve them efficiently.


- Document data quality standards, testing procedures, and validation results.


- Generate data quality reports and communicate findings to engineering teams.


- Develop automated testing frameworks to improve data quality validation efficiency.


- Focus primarily on validating and assuring the quality of existing pipelines, rather than building full pipelines.
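
To illustrate the kind of data quality script referenced above, here is a minimal Python sketch assuming a pandas DataFrame loaded from a hypothetical staging extract. The file path, column names, and thresholds are purely illustrative and not part of this role's actual stack.

```python
import pandas as pd

# Illustrative thresholds and column names; adjust to the dataset under test.
MAX_NULL_RATE = 0.01
KEY_COLUMNS = ["order_id"]  # hypothetical primary-key column
REQUIRED_COLUMNS = ["order_id", "customer_id", "order_date", "amount"]


def run_quality_checks(df: pd.DataFrame) -> list[str]:
    """Return human-readable data quality findings (empty list = all checks passed)."""
    findings = []

    # Schema check: every expected column must be present.
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        findings.append(f"Missing columns: {missing}")
        return findings  # remaining checks assume the full schema

    # Completeness check: null rate per required column must stay under the threshold.
    for col in REQUIRED_COLUMNS:
        null_rate = df[col].isna().mean()
        if null_rate > MAX_NULL_RATE:
            findings.append(
                f"Column '{col}' null rate {null_rate:.2%} exceeds {MAX_NULL_RATE:.2%}"
            )

    # Uniqueness check: key columns must not contain duplicates.
    dup_count = int(df.duplicated(subset=KEY_COLUMNS).sum())
    if dup_count:
        findings.append(f"{dup_count} duplicate rows on key {KEY_COLUMNS}")

    # Validity check: monetary amounts should be non-negative.
    negative = int((df["amount"] < 0).sum())
    if negative:
        findings.append(f"{negative} rows with negative 'amount'")

    return findings


if __name__ == "__main__":
    df = pd.read_csv("staging_orders.csv")  # hypothetical extract from the pipeline
    for finding in run_quality_checks(df):
        print("FAIL:", finding)
```

In practice a script like this would be parameterised per dataset and wired into the pipeline so that failures block downstream loads or raise alerts.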


Required Technical Skills:


- Strong understanding of data engineering concepts, including ETL/ELT processes, data warehousing, and data modeling.


- Proficiency in SQL for complex data validation and querying.


- Experience with scripting languages such as Python or Shell for automation.


- Hands-on experience with data pipeline orchestration tools (e.g., Apache Airflow, Azure Data Factory, AWS Glue); see the sketch after this list.


- Knowledge of data quality frameworks and tools (e.g., Great Expectations, Deequ, custom validation scripts).


- Familiarity with cloud platforms (AWS, Azure, or GCP) and their data services.


- Understanding of data formats (JSON, Parquet, Avro, CSV) and data storage systems.


- Exposure to logging/monitoring tools (CloudWatch, Datadog, ELK, etc.) is a plus.
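
As a rough example of the orchestration skills listed above, the sketch below shows a quality-check task wrapped in a scheduled DAG, assuming Apache Airflow 2.x. The DAG name, schedule, and hard-coded row counts are hypothetical; a real task would query the warehouse through an Airflow connection instead.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def check_row_counts():
    """Placeholder quality gate: compare source and target row counts.

    The counts below are hard-coded purely for illustration; in a real
    deployment they would come from warehouse queries.
    """
    source_rows, target_rows = 1000, 1000  # illustrative values
    if source_rows != target_rows:
        raise ValueError(
            f"Row count mismatch: source={source_rows}, target={target_rows}"
        )


with DAG(
    dag_id="orders_quality_checks",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    validate = PythonOperator(
        task_id="validate_row_counts",
        python_callable=check_row_counts,
    )
```

A failed check fails the task, which makes the problem visible in the scheduler UI and can block dependent loads or trigger alerts.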


Preferred Skills:


- Experience with big data technologies (Spark, Hadoop, Kafka).


- Knowledge of CI/CD practices for data pipelines (see the test sketch after this list).


- Familiarity with version control systems (Git).


- Understanding of data governance and compliance requirements.


- Experience with data visualization tools for quality reporting.
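
To show how data quality checks can run in CI/CD, here is a small pytest-style sketch assuming pandas and a hypothetical sample extract checked into the repository; the fixture path and column names are illustrative only.

```python
import pandas as pd
import pytest


@pytest.fixture
def orders() -> pd.DataFrame:
    # In CI this would load a sample or freshly built extract; the path is illustrative.
    return pd.read_csv("tests/fixtures/orders_sample.csv")


def test_no_null_keys(orders):
    assert orders["order_id"].notna().all(), "order_id must never be null"


def test_amounts_non_negative(orders):
    assert (orders["amount"] >= 0).all(), "amount must be non-negative"
```

Running such tests on every commit or scheduled deployment gives an automated gate before pipeline changes reach production.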

