
Kaleidofin - Lead Data Engineer - ETL/PySpark

Posted on: 28/09/2025

Job Description

What you'll do?

- You will take on the role of technical expert and lead member of our Engineering team.

- Design and build data pipelines, from ingestion to consumption, within a hybrid big data architecture using cloud-native AWS services, NoSQL, SQL, etc.

- Design and develop distributed, high-volume, high-velocity, multi-threaded event processing systems.

- Design, build, and operationalize large-scale enterprise data solutions using Hadoop-based technologies along with AWS, Spark, and data lakes, developing in Hive, PySpark, and Python.

- Develop efficient code for multiple use cases built on the platform, leveraging Python and big data technologies.

- Ingest data from files, streams, and databases, and process the data with Hadoop, Scala, SQL/NoSQL databases, Spark, ML, and IoT.

- Develop programs in Scala and Python as part of data cleaning and processing.

- This includes data modelling, data ingestion, transformation, and data consumption patterns, as well as optimizing complex queries and creating efficient UDFs to extend functionality.

- You will be involved in product feature development and will be working in close partnership with other engineering teams.

- You will be responsible for mentoring other team members and for ensuring high availability and stability of the platform for batch and stream processing systems.

Who you need to be?

- Experience: minimum 6 years.

- Bachelor's degree in computer science or a computing-related discipline from a premier institute.

- The candidate should have data processing (ETL) ability, along with scripting experience.

- Data engineer with 3 to 8 years of hands-on experience with SQL and NoSQL databases such as HBase, Cassandra, or MongoDB.

- 2+ years of extensive experience working with Hadoop and related processing frameworks such as Spark, Hive, Kafka, etc.

- Experience working with REST- and SOAP-based APIs to extract data for data pipelines.

- Experience with cloud-native ETL languages/frameworks (Scala, Python, Databricks, AWS Glue).

- Experience working with real-time data streams and the Kafka platform.

- Working knowledge of workflow orchestration tools like Apache Airflow to design and deploy DAGs.

- Location: Bangalore.

