Posted on: 15/08/2025
Responsibilities :
- Hands-on experience in data-related activities such as data parsing, cleansing, quality definition, data pipelines, storage, and ETL scripts.
- Expert knowledge of AWS Data Lake implementation and support (S3, Glue, DMS, Athena, Lambda, API Gateway, Redshift).
- Experience with the Python, PySpark, and SQL programming languages.
- Hands-on experience with data migration.
- Experience consuming REST APIs using various authentication options within AWS (see the signed-request sketch after this list).
- Understanding of the Lambda architecture pattern (combined batch and stream processing layers).
- Orchestrate triggers, debug, and schedule batch jobs using AWS Glue, Lambda, and Step Functions (a minimal trigger sketch also follows this list).
- Hands-on experience with Redshift, including data models, storage, and writing effective queries.
- Understanding of AWS security features such as IAM roles and policies.
- Knowledge of DevOps tools and the CI/CD process.
- AWS certification is highly preferred.
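As a concrete illustration of the REST API point above, here is a minimal Python sketch of calling an IAM-protected API Gateway endpoint with SigV4 request signing. The URL, region, and resource path are placeholders, not part of this posting.

```python
import json
import urllib.request

import boto3
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

# Hypothetical endpoint; replace with a real IAM-protected API Gateway URL.
URL = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/items"
REGION = "us-east-1"

def call_signed_api(url: str, region: str) -> dict:
    """GET an endpoint, signing the request with SigV4 credentials."""
    credentials = boto3.Session().get_credentials()
    aws_request = AWSRequest(method="GET", url=url)
    # "execute-api" is the signing service name for API Gateway.
    SigV4Auth(credentials, "execute-api", region).add_auth(aws_request)
    request = urllib.request.Request(url, headers=dict(aws_request.headers))
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())

if __name__ == "__main__":
    print(call_signed_api(URL, REGION))
```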
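For the orchestration point, a minimal sketch of a Lambda handler that starts a Glue batch job on a trigger; the job name and the --source_key argument are hypothetical, and a Step Functions state machine could start the same job through a Glue task state.

```python
import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    """Start a Glue batch job when the function is triggered (e.g. by S3)."""
    run = glue.start_job_run(
        JobName="nightly-etl",  # hypothetical Glue job name
        Arguments={"--source_key": event.get("key", "")},
    )
    return {"JobRunId": run["JobRunId"]}
```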
Technical Skills & Expertise :
Cloud (AWS) Expertise :
- S3 (Simple Storage Service) : Data storage and partitioning, lifecycle policies, data replication, encryption at rest and in transit, data versioning, and data archiving and sharing mechanisms (see the lifecycle sketch after this list).
- Lambda : Creating Lambda functions, configuring event-driven functions, monitoring, and integration with other AWS services such as S3, Glue, API Gateway, and Redshift.
- Glue : Creating and managing ETL pipelines, Glue crawlers, job scheduling, and integration with S3, Redshift, and Athena.
- Redshift : Creating tables, views, and stored procedures, parameter tuning, workload management, configuring triggers, and using Redshift Spectrum to query S3 data (a Spectrum sketch also follows this list).
- API Gateway : Designing RESTful APIs, securing endpoints using IAM or Cognito, and throttling and logging API usage.
- VPC (Virtual Private Cloud) : Awareness of existing VPC design, subnets, NAT gateways, peering, routing, and network ACLs when creating services.
- ELB (Elastic Load Balancer) : Configuring ALB/NLB for load distribution, health checks, sticky sessions, and SSL termination.
- CloudTrail : Enabling auditing and compliance logging, managing trails, integrating with CloudWatch and third-party SIEM tools.
- SageMaker : Knowledge of SageMaker ML model training and deployment workflows, and of managing notebooks, endpoints, and model versioning.
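A minimal boto3 sketch of the S3 lifecycle point above, assuming a hypothetical data-lake bucket: the rule archives objects under the raw/ prefix to Glacier after 90 days and expires non-current versions after 30 days.

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-data-lake-raw",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-raw",
                "Status": "Enabled",
                "Filter": {"Prefix": "raw/"},
                # Move raw objects to Glacier after 90 days.
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                # Clean up superseded versions after 30 days.
                "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
            }
        ]
    },
)
```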
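And a sketch of querying S3-backed data through Redshift Spectrum via the Redshift Data API; the cluster, database, user, and external table names are hypothetical.

```python
import time

import boto3

client = boto3.client("redshift-data")

# Run a query against an external (Spectrum) table that maps onto S3 data.
resp = client.execute_statement(
    ClusterIdentifier="analytics-cluster",
    Database="dev",
    DbUser="etl_user",
    Sql="SELECT event_date, COUNT(*) FROM spectrum.clickstream GROUP BY 1;",
)

# The Data API is asynchronous: poll until the statement completes.
while client.describe_statement(Id=resp["Id"])["Status"] not in (
    "FINISHED", "FAILED", "ABORTED",
):
    time.sleep(1)

rows = client.get_statement_result(Id=resp["Id"])["Records"]
```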
CI/CD & DevOps Tools :
- GitHub / GitHub Actions : Managing version control, branching strategies, and automating workflows for testing and deployment.
- Jenkins / CloudBees : Building pipelines for build-test-deploy automation, plugin management, parallel execution, and integrations with SCM and artifact repositories.
- SonarQube : Static code analysis, security vulnerability checks, technical debt reporting, and integration into CI/CD (see the quality-gate sketch after this list).
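As one way to wire SonarQube into a pipeline, here is a hedged Python sketch of a CI step that fails the build when the project's quality gate is not passing; the host, token environment variable, and project key are assumptions, while /api/qualitygates/project_status is a standard SonarQube Web API endpoint.

```python
import base64
import json
import os
import urllib.parse
import urllib.request

host = os.environ.get("SONAR_HOST", "https://sonarqube.example.com")  # hypothetical
token = os.environ["SONAR_TOKEN"]  # hypothetical CI secret
params = urllib.parse.urlencode({"projectKey": "my-service"})  # hypothetical key

request = urllib.request.Request(f"{host}/api/qualitygates/project_status?{params}")
# SonarQube accepts a token as the basic-auth username with an empty password.
auth = base64.b64encode(f"{token}:".encode()).decode()
request.add_header("Authorization", f"Basic {auth}")

with urllib.request.urlopen(request) as response:
    status = json.loads(response.read())["projectStatus"]["status"]

# A non-zero exit code fails the CI step when the gate is not "OK".
raise SystemExit(0 if status == "OK" else 1)
```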
DevOps/MLOps & AI/ML Awareness :
- Understanding of ML model lifecycle : data preprocessing, training, evaluation, deployment, and monitoring.
- Experience supporting Data Scientists and ML Engineers in deploying models to production (see the endpoint-invocation sketch after this list).
- Familiarity with tools and workflows such as SageMaker, MLflow, and Airflow (optional), and with pipeline integration.
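To make the deployment point concrete, a minimal sketch of invoking an already-deployed SageMaker endpoint with boto3; the endpoint name and JSON payload shape are hypothetical and depend on the model container.

```python
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

# Hypothetical endpoint serving a model that accepts and returns JSON.
response = runtime.invoke_endpoint(
    EndpointName="churn-model-prod",
    ContentType="application/json",
    Body=json.dumps({"features": [0.2, 1.7, 3.1]}),
)
prediction = json.loads(response["Body"].read())
print(prediction)
```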
Technical Skills :
AWS Data Lake, Python/PySpark, SQL, AWS Lambda, Redshift, AWS Glue, Data Migration, IAM Security, CI/CD, AWS Certification.
Posted in : Data Engineering
Functional Area : Data Engineering
Job Code : 1530394