
Job Description



This role supports Empower's data and AI strategy, with a focus on building Responsible AI capabilities. The Data Engineer will design and implement scalable, ethical, and secure data pipelines and infrastructure that underpin AI/ML systems, ensuring that high-quality data flows into model development, testing, monitoring, and governance workflows. The candidate will work across cloud (AWS) and on-premises environments, contributing to the lifecycle of data used for Responsible AI tooling, including bias detection, model transparency, and compliance tracking.


ESSENTIAL FUNCTIONS:


- Design, build, and maintain data pipelines that support model development, testing, and monitoring, with a focus on AI governance and traceability.


- Collaborate with cross-functional teams (including Data Scientists, ML Engineers, and Risk) to understand data needs for AI use cases.


- Integrate data quality, lineage, and metadata tracking into ETL pipelines to support Responsible AI workflows.


- Support ingestion and transformation of structured and unstructured data (including NLP datasets) for AI model training and evaluation.


- Design with compliance in mind: integrate secure handling of PII and support auditability in data flows.


- Participate in technical design discussions focused on enabling transparency, fairness, and explainability in data workflows.


- Troubleshoot and resolve performance and data quality issues in distributed AI pipelines.


- Contribute to reusable libraries or templates to support standardized data practices across AI projects.


QUALIFICATIONS:


- Bachelor's degree in Computer Science, Information Systems, or a related field.


- 2 to 6 years of experience in data engineering, preferably in AI/ML environments.


- Strong Python and SQL skills with experience in data pipeline orchestration (e.g., Airflow, Step Functions).


- Experience with Big Data frameworks (e.g., Spark, Hadoop) and streaming data platforms (e.g., Kafka).


- Experience working in AWS environments with services like S3, Glue, Redshift, SageMaker, and Lake Formation.


- Familiarity with machine learning workflows and data requirements (e.g., training/test splits, data versioning, feature stores).


- Experience integrating data validation, data lineage, or metadata tools (e.g., Great Expectations, Apache Atlas, Amundsen).


- Understanding of Responsible AI principles; experience supporting the data aspects of fairness, bias, explainability, or model monitoring is a strong plus.


- Experience with JIRA and Agile methodologies.


- Experience in financial services or other highly regulated environments is preferred.
