Posted on: 22/12/2025
Location: Pune / Bangalore
Experience: 5-9 Years
Role Summary:
The Senior QA/Data Tester is responsible for ensuring the technical integrity, accuracy, and performance of enterprise-scale data pipelines. This role goes beyond traditional functional testing, requiring a deep understanding of Big Data architectures to validate complex ETL/ELT processes.
You will be the gatekeeper of data quality, working closely with Data Engineers to verify Spark applications, Hive schemas, and large-scale data movements across Hadoop and object storage environments.
Responsibilities:
- Design and execute comprehensive data validation strategies for complex ETL/ELT pipelines, verifying data accuracy and completeness from source to target.
- Develop and maintain automated testing frameworks using Python and Shell scripting to validate large datasets in Hadoop and Object Storage environments.
- Perform advanced SQL validation (Oracle/Hive) to verify data transformations, aggregations, and business logic across distributed databases.
- Validate Spark/Scala applications by performing unit, integration, and system testing on large-scale distributed workloads.
- Conduct end-to-end testing of data flows orchestrated via Apache NiFi, ensuring proper data routing, transformation, and error handling.
- Analyze complex data structures and schemas to identify data anomalies, missing records, and performance bottlenecks within the pipeline.
- Implement data reconciliation scripts to verify parity between legacy Data Warehouses and modern Big Data lakes (a minimal sketch follows this list).
- Collaborate with Agile delivery teams to define "Definition of Done" for data features and ensure quality is integrated into the CI/CD pipeline.
- Document detailed test plans, test cases, and defect reports, providing clear root-cause analysis for data-related failures.
- Manage test data environments and generate synthetic datasets to simulate high-volume production scenarios.
- Coordinate with cross-functional stakeholders to manage environment dependencies and ensure seamless deployment across SIT and UAT environments.
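
To give a sense of the reconciliation work described above, here is a minimal sketch in PySpark. The table names (source_db.orders, lake_db.orders), the key column, and the numeric columns are invented for illustration; a production script would add tolerance rules, partition filters, and reporting.

```python
# Minimal reconciliation sketch (hypothetical table and column names).
# Compares row counts, numeric checksums, and key-level parity between
# a legacy warehouse extract and its data-lake counterpart.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("dw-lake-reconciliation")
         .enableHiveSupport()
         .getOrCreate())

source = spark.table("source_db.orders")   # legacy DW extract (assumed)
target = spark.table("lake_db.orders")     # data-lake copy (assumed)

# 1. Row-count parity.
src_count, tgt_count = source.count(), target.count()
assert src_count == tgt_count, f"Row count mismatch: {src_count} vs {tgt_count}"

# 2. Column-level checksum parity on numeric columns.
for col in ["order_amount", "quantity"]:   # columns assumed for illustration
    src_sum = source.agg(F.sum(col)).first()[0]
    tgt_sum = target.agg(F.sum(col)).first()[0]
    assert src_sum == tgt_sum, f"Checksum mismatch on {col}: {src_sum} vs {tgt_sum}"

# 3. Key-level diff: rows present in the source but missing from the lake.
missing = source.select("order_id").subtract(target.select("order_id"))
assert missing.count() == 0, "Orders missing from the lake copy"
```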
Technical Requirements:
- 5-9 years of dedicated QA experience with a specialized focus on Data Warehousing, ETL, or Big Data Engineering projects.
- Expert-level proficiency in SQL, with the ability to write complex joins, subqueries, and analytical functions in Oracle or Hive.
- Hands-on experience testing Spark applications (Spark Core, Spark SQL) running on Hadoop or Cloud Object Storage.
- Proficiency in Python and Unix/Shell scripting for automating repetitive testing tasks and building custom validation tools.
- Strong understanding of Big Data components such as HDFS, YARN, and Hive Metastore.
- Solid experience working in an Agile/Scrum environment, utilizing tools like JIRA and ALM for defect tracking and sprint management.
- Proven ability to perform white-box testing by reviewing code logic in Scala or Python to identify potential failure points (an illustrative unit-test sketch follows this list).
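
As an indication of the expected white-box testing skill level, the following is a hedged sketch of a pytest unit test for a PySpark transformation. Both the function (dedupe_latest) and the schema are invented for this example; actual application code and test fixtures will vary.

```python
# Sketch of a unit test for a hypothetical Spark transformation.
# dedupe_latest keeps only the most recent record per business key;
# the function and schema are invented for illustration.
import pytest
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F


def dedupe_latest(df, key_col, ts_col):
    """Keep the newest row per key, ordered by a timestamp column."""
    w = Window.partitionBy(key_col).orderBy(F.col(ts_col).desc())
    return (df.withColumn("_rn", F.row_number().over(w))
              .filter(F.col("_rn") == 1)
              .drop("_rn"))


@pytest.fixture(scope="module")
def spark():
    return SparkSession.builder.master("local[2]").appName("unit-tests").getOrCreate()


def test_dedupe_latest_keeps_newest_row(spark):
    rows = [("A", "2024-01-01", 10), ("A", "2024-02-01", 20), ("B", "2024-01-15", 5)]
    df = spark.createDataFrame(rows, ["id", "updated_at", "amount"])

    result = {r["id"]: r["amount"] for r in dedupe_latest(df, "id", "updated_at").collect()}

    assert result == {"A": 20, "B": 5}
```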
Preferred Skills:
- Hands-on experience with Apache NiFi for validating real-time data ingestion and flow management.
- Familiarity with Data Quality (DQ) tools like Great Expectations, Deequ, or Informatica DVO (a short Great Expectations sketch appears after this list).
- Experience with CI/CD tools (Jenkins/GitLab) and integrating automated data tests into deployment pipelines.
- Knowledge of NoSQL databases (HBase, Cassandra, or MongoDB) and their respective testing methodologies.
- Exposure to cloud data platforms (AWS Glue, EMR, or Snowflake) is a significant plus.
- Understanding of Data Privacy and Security testing, including data masking and PII validation.
- Strong analytical mindset with a proactive approach to identifying edge cases in massive datasets.
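
For the DQ tooling mentioned above, here is a short sketch of declarative checks using the classic pandas-based Great Expectations API (newer GX releases have a different entry point); the file path and column names are placeholders.

```python
# Declarative data-quality checks with Great Expectations
# (classic pandas-based API; newer GX versions differ).
import great_expectations as ge

df = ge.read_csv("orders_extract.csv")  # hypothetical extract

checks = [
    df.expect_column_values_to_be_not_null("order_id"),
    df.expect_column_values_to_be_unique("order_id"),
    df.expect_column_values_to_be_between("order_amount", min_value=0),
]

failed = [c for c in checks if not c.success]
assert not failed, f"{len(failed)} data-quality expectation(s) failed"
```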
Posted in: Quality Assurance
Functional Area: QA & Testing
Job Code: 1593848