
Lead II - Data Engineering

Worksconsultancy
Multiple Locations
7 - 9 Years

Posted on: 03/12/2025

Job Description

Responsibilities :


- 9 years of experience in the development of Data Lakes with Data Ingestion from disparate data sources, including relational databases, flat files, APIs, and streaming data.

- Experience in the design and development of Data Platforms and in data ingestion from disparate data sources into the cloud.

- Expertise in core AWS Services including AWS IAM, VPC, EC2, EKS/ECS, S3, RDS, DMS, Lambda, CloudWatch, CloudFormation, and CloudTrail.

- Proficiency in programming languages such as Python and PySpark for efficient data processing, preferably Python.

- Architect and implement robust ETL pipelines using AWS Glue, defining data extraction methods, transformation logic, and data loading procedures across different data sources.

- 10 years of experience in using IaC tools like Terraform.

- 10 years of experience in the development of CI/CD pipelines (GitHub Actions, Jenkins).

- Experience in the development of Event-Driven Distributed Systems in the Cloud using Serverless Architecture.

- Ability to work with the Infrastructure team on AWS service provisioning for databases, services, network design, IAM roles, and AWS clusters.

- 2-3 years of experience working with Document DB.

- Ability to design, orchestrate and schedule jobs using Airflow.


- Knowledge of AWS AI Services like AWS Entity Resolution, AWS Comprehend.

- Ability to run custom LLMs using Amazon SageMaker.


- Ability to use Large Language Models (LLMs) for Data Classification and Identification of PII data entities.

Nice to have Skills :


- 9 years of experience in the development of Data Audit, Compliance and Retention standards for Data Governance, and automation of the governance processes.

- Experience in data modelling with NoSQL Databases like Document DB.

- Experience in using column-oriented data file formats like Apache Parquet, and Apache Iceberg as the table format for analytical datasets.

- Expertise in development of Retrieval-Augmented Generation (RAG) and Agentic Workflows for providing context to LLMs based on proprietary enterprise data.

- Ability to develop re-ranking strategies using results from Index and Vector stores for LLMs to improve the quality of the output.

Skills : Data Lake, AWS, Python


Notice Period : Immediate - 30 days.

