Posted on: 02/12/2025
Responsibilities:
- Experience in the design and development of data platforms and in ingesting data from disparate sources into the cloud.
- Expertise in core AWS services, including IAM, VPC, EC2, EKS/ECS, S3, RDS, DMS, Lambda, CloudWatch, CloudFormation, and CloudTrail.
- Proficiency in Python and PySpark for efficient data processing; Python preferred.
- Architect and implement robust ETL pipelines using AWS Glue, defining data extraction methods, transformation logic, and data loading procedures across different data sources.
- 10 years of experience in using IaC tools such as Terraform.
- 10 years of experience in the development of CI/CD pipelines (GitHub Actions, Jenkins).
- Experience in the development of Event-Driven Distributed Systems in the Cloud using Serverless Architecture.
- Ability to work with the infrastructure team on AWS service provisioning for databases, services, network design, IAM roles, and AWS clusters.
- 2-3 years of experience working with Document DB.
- Ability to design, orchestrate and schedule jobs using Airflow.
- Knowledge of AWS AI services such as AWS Entity Resolution and Amazon Comprehend.
- Ability to run custom LLMs using Amazon SageMaker.
- Ability to use Large Language Models (LLMs) for data classification and identification of PII entities.
Nice-to-have Skills:
- Experience in data modelling with NoSQL Databases like Document DB.
- Experience in using column-oriented file formats such as Apache Parquet, and Apache Iceberg as the table format for analytical datasets.
- Expertise in development of Retrieval-Augmented Generation (RAG) and Agentic Workflows for providing context to LLMs based on proprietary enterprise data.
- Ability to develop re-ranking strategies using results from Index and Vector stores for LLMs to improve the quality of the output.
Skills: Data Lake, AWS, Python
Notice Period: Immediate to 30 days.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1583549