Posted on: 29/07/2025
Job Description:
This role supports Empower's data and AI strategy, with a focus on building Responsible AI capabilities. The Data Engineer will design and implement scalable, ethical, and secure data pipelines and infrastructure that underpin AI/ML systems, ensuring high-quality data flows into model development, testing, monitoring, and governance workflows. The candidate will work across cloud (AWS) and on-premises environments, contributing to the lifecycle of data used for Responsible AI tooling, including bias detection, model transparency, and compliance tracking.
ESSENTIAL FUNCTIONS:
- Design, build, and maintain data pipelines that support model development, testing, and monitoring, with a focus on AI governance and traceability.
- Collaborate with cross-functional teams (including Data Scientists, ML Engineers, and Risk) to understand data needs for AI use cases.
- Integrate data quality, lineage, and metadata tracking into ETL pipelines to support Responsible AI workflows.
- Support ingestion and transformation of structured and unstructured data (including NLP datasets) for AI model training and evaluation.
- Design with compliance in mind: integrate secure handling of PII and support auditability in data flows.
- Participate in technical design discussions focused on enabling transparency, fairness, and explainability in data workflows.
- Troubleshoot and resolve performance and data quality issues in distributed AI pipelines.
- Contribute to reusable libraries or templates to support standardized data practices across AI projects.
QUALIFICATIONS:
- Bachelor's degree in Computer Science, Information Systems, or a related field.
- 2 to 6 years of experience in data engineering, preferably in AI/ML environments.
- Strong Python and SQL skills with experience in data pipeline orchestration (e.g., Airflow, Step Functions).
- Experience with Big Data frameworks (e.g., Spark, Hadoop) and streaming data platforms (e.g., Kafka).
- Experience working in AWS environments with services like S3, Glue, Redshift, SageMaker, and Lake Formation.
- Familiarity with machine learning workflows and data requirements (e.g., training/test splits, data versioning, feature stores).
- Experience integrating data validation, data lineage, or metadata tools (e.g., Great Expectations, Apache Atlas, Amundsen).
- Understanding of Responsible AI principles, and experience supporting the data aspects of fairness, bias, explainability, or model monitoring, are a strong plus.
- Experience with JIRA and Agile methodologies.
- Experience in financial services or other highly regulated environments preferred.
Posted by: Suganth R, Talent Acquisition Consultant at Great West Global Business Services India Pvt. Ltd
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1521535