Job Description

Job Title : Data Engineer

Location : Hyderabad

Work Mode : Hybrid

Office Timings : 11 AM to 8 PM IST

Rounds of Interview : 2 (F2F)

Core technical expertise required for this role :

- AWS PySpark : Strong hands-on experience using PySpark within AWS environments to process large-scale datasets efficiently.

- AWS Glue : Experience building, maintaining, and optimizing AWS Glue jobs for data extraction, transformation, and loading.

- AWS S3 : Proficient in working with Amazon S3 for data storage, data lake architecture, and integration with analytics pipelines.

- PySpark : Ability to write optimized PySpark code for distributed data processing and transformation.

- ETL Frameworks : Experience designing, developing, and maintaining scalable ETL frameworks for batch and streaming data pipelines (an illustrative sketch follows this list).
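
The sketch below is a minimal illustration of the kind of AWS PySpark batch work described above: reading raw data from S3, applying a transformation, and writing a curated dataset back to the data lake. The bucket paths, column names, and aggregation logic are hypothetical placeholders, not part of this posting.

from pyspark.sql import SparkSession, functions as F

# Hypothetical S3 locations, used purely for illustration.
SOURCE_PATH = "s3://example-raw-bucket/orders/"
TARGET_PATH = "s3://example-curated-bucket/orders_daily/"

spark = SparkSession.builder.appName("orders-daily-aggregation").getOrCreate()

# Read raw order events from the data lake.
orders = spark.read.parquet(SOURCE_PATH)

# Aggregate order totals per day, a typical batch transformation step.
daily_totals = (
    orders
    .withColumn("order_date", F.to_date("order_timestamp"))
    .groupBy("order_date")
    .agg(
        F.sum("order_amount").alias("total_amount"),
        F.count("*").alias("order_count"),
    )
)

# Write the curated dataset back to S3, partitioned by date for efficient querying.
daily_totals.write.mode("overwrite").partitionBy("order_date").parquet(TARGET_PATH)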

Skills that will provide additional value to the role :

- Knowledge on Talend Cloud ETL : Familiarity with Talend Cloud for building and orchestrating ETL pipelines.

- Kafka : Understanding of event-driven architectures and streaming data platforms (see the streaming sketch after this list).

- Snowflake Cloud : Experience working with Snowflake for cloud-based data warehousing and analytics.

- PowerBI : Exposure to data visualization and reporting using PowerBI.
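
As a rough sketch of how the optional streaming skills tie into the core stack, the example below reads events from a Kafka topic with PySpark Structured Streaming and lands them in S3. The broker address, topic name, and paths are assumptions for illustration, and the job would need the spark-sql-kafka connector package on the classpath.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clickstream-ingest").getOrCreate()

# Hypothetical broker and topic names, for illustration only.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "clickstream-events")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers keys and values as binary; cast the value to a string payload.
payloads = events.select(F.col("value").cast("string").alias("payload"))

# Append raw payloads to the S3 data lake as Parquet, with checkpointing for fault tolerance.
query = (
    payloads.writeStream
    .format("parquet")
    .option("path", "s3://example-raw-bucket/clickstream/")
    .option("checkpointLocation", "s3://example-raw-bucket/_checkpoints/clickstream/")
    .outputMode("append")
    .start()
)

query.awaitTermination()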

Qualifications :

(Educational background and professional experience requirements)

- Bachelor's or Master's degree in Computer Science or a related discipline, or equivalent hands-on industry experience.

- At least 3 years of proven experience in designing, developing, and analyzing data-driven applications.

- Hands-on experience designing and implementing solutions on AWS Cloud; exposure to retail industry use cases is preferred.

Key Responsibilities :

(What the role involves on a day-to-day basis)

- Process data using Spark (PySpark) : Develop and manage Spark-based data processing pipelines to handle large volumes of structured and unstructured data.

- Collaborate with data analysts and with business and analytics teams across functions to deliver high-quality, reliable datasets that meet their reporting and analysis needs.

- Design and build end-to-end ETL frameworks that process and extract data from cloud databases using AWS services such as Lambda, Glue, PySpark, Step Functions, SNS, SQS, and Batch, supporting scalable and automated data workflows (see the orchestration sketch after this list).

- Think strategically and take ownership of data pipelines, driving data governance, best practices, and long-term architectural decisions within the organization.

- Maintain a sound understanding of the various AWS services and how they integrate to build robust data platforms.

- Ramp up quickly on existing frameworks built with AWS PySpark and Glue, analyze current implementations, and ensure continuity.

- Review ETL frameworks to identify performance bottlenecks, propose optimizations, and recommend cost-saving strategies.
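
As a hedged illustration of the orchestration responsibilities above, the sketch below shows a Lambda handler that starts a Glue job run via boto3 and raises an SNS alert if the launch fails. The job name, topic ARN, and event handling are hypothetical placeholders, not details of this role.

import json
import os

import boto3

glue = boto3.client("glue")
sns = boto3.client("sns")

# Hypothetical job and topic names, supplied via environment variables for illustration.
GLUE_JOB_NAME = os.environ.get("GLUE_JOB_NAME", "orders-daily-etl")
ALERT_TOPIC_ARN = os.environ.get("ALERT_TOPIC_ARN", "arn:aws:sns:us-east-1:123456789012:etl-alerts")

def handler(event, context):
    """Lambda entry point: start a Glue job run for each incoming trigger event."""
    try:
        run = glue.start_job_run(
            JobName=GLUE_JOB_NAME,
            Arguments={"--source_event": json.dumps(event)},
        )
        return {"job_run_id": run["JobRunId"]}
    except Exception as exc:
        # Alert operators if the Glue job could not be started, then re-raise so Lambda can retry.
        sns.publish(TopicArn=ALERT_TOPIC_ARN, Message=f"Failed to start {GLUE_JOB_NAME}: {exc}")
        raise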
