Posted on: 30/10/2025
About the Job:
- Develops technical tools and programs to cleanse, organize, and transform data, and to maintain, protect, and update data structures and data integrity on an automated basis.
- Applies extract, transform, and load (ETL) techniques to tie together large data sets from a variety of sources (a minimal ETL sketch follows this list).
- Partners with both internal and external parties to design, build, and oversee the deployment and operation of technology architecture, solutions, and software.
- Designs, develops, and programs methods, processes, and systems to capture, manage, store, and utilize structured and unstructured data to generate actionable insights and solutions.
- Responsible for the maintenance, improvement, cleaning, and manipulation of data in the business client's operational and analytics databases.
- Proactively analyzes and evaluates the business client's databases to identify and recommend improvements and optimizations.
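To make the ETL responsibility above concrete, here is a minimal PySpark sketch of an extract-transform-load flow. Every path, table, column, and credential in it is a hypothetical placeholder, not something specified in this posting.

```python
# A minimal PySpark ETL sketch; all names below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("etl_sketch").getOrCreate()

# Extract: pull two hypothetical source data sets.
orders = spark.read.parquet("s3a://raw-bucket/orders/")          # file source
customers = (spark.read.format("jdbc")                           # RDBMS source
             .option("url", "jdbc:postgresql://db-host:5432/crm")
             .option("dbtable", "public.customers")
             .option("user", "etl_user")
             .option("password", "***")
             .load())

# Transform: tie the data sets together and derive a clean view.
enriched = (orders.join(customers, on="customer_id", how="left")
                  .dropDuplicates(["order_id"])
                  .filter("order_status IS NOT NULL"))

# Load: write the result to a curated zone for downstream analytics.
enriched.write.mode("overwrite").parquet("s3a://curated-bucket/orders_enriched/")
```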
Essential Job Functions:
- Uses knowledge of existing and emerging data science engineering principles, theories, and techniques to inform business decisions and produce accurate business insights.
- Completes projects and assignments of moderate scope and complexity under normal supervision to ensure customer and business needs are met.
- Applies discretion and independent judgment to interpret data trends and summarize data insights.
- Assists in preliminary data exploration and data preparation for accurate model development.
- Establishes working relationships with others outside the Data Science Engineering area of expertise.
- Prepares, with assistance, presentations of project outputs for external customers.
- Design, develop, and maintain scalable data pipelines and systems for data processing.
- Utilize data lakehouse, Spark on Kubernetes, and related technologies to manage large-scale data processing.
- Perform data ingestion from various sources (APIs, RDBMS, NoSQL databases, Kafka, middleware, and files) using Spark, and process the data into the lakehouse platform.
- Develop and maintain PySpark scripts to automate data processing tasks.
- Implement full and incremental data loading strategies to ensure data consistency and availability (a minimal incremental-load sketch follows this list).
- Orchestrate and monitor workflows using Apache Airflow (an Airflow sketch also follows this list).
- Ensure code quality and version control using Git.
- Troubleshoot and resolve data-related issues in a timely manner.
- Stay up-to-date with the latest industry trends and technologies to continuously improve our data infrastructure.
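As referenced in the list above, here is a minimal sketch of full versus incremental loading in PySpark. The table name, watermark column ("updated_at"), and paths are hypothetical placeholders, not something from this posting.

```python
# A minimal full vs. incremental load sketch in PySpark, assuming a
# hypothetical "updated_at" watermark column on a managed lakehouse table.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("incremental_load_sketch").getOrCreate()

TARGET = "lakehouse.sales.orders"   # hypothetical lakehouse table

def full_load(source_path: str) -> None:
    # Full load: replace the target with a complete snapshot of the source.
    df = spark.read.parquet(source_path)
    df.write.mode("overwrite").saveAsTable(TARGET)

def incremental_load(source_path: str) -> None:
    # Incremental load: only pull rows newer than the current high-water mark.
    # Assumes full_load has run at least once, so the target is non-empty.
    high_water = (spark.table(TARGET)
                       .agg(F.max("updated_at").alias("hw"))
                       .first()["hw"])
    delta = (spark.read.parquet(source_path)
                  .filter(F.col("updated_at") > F.lit(high_water)))
    # Append is idempotent only if the source never re-emits rows; a MERGE
    # on the primary key is the safer choice when updates can occur.
    delta.write.mode("append").saveAsTable(TARGET)
```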
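And here is a minimal Apache Airflow sketch for the orchestration item above. The DAG id, schedule, and task bodies are hypothetical placeholders.

```python
# A minimal Airflow DAG sketch: ingest, then validate, on a daily schedule.
# Uses the Airflow 2.4+ "schedule" parameter (older versions use
# "schedule_interval"); all names here are hypothetical.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest():
    print("run Spark ingestion job")   # placeholder for a spark-submit call

def validate():
    print("run data-quality checks")   # placeholder for validation logic

with DAG(
    dag_id="lakehouse_ingestion",      # hypothetical DAG id
    schedule="@daily",
    start_date=datetime(2025, 1, 1),
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    validate_task = PythonOperator(task_id="validate", python_callable=validate)
    ingest_task >> validate_task       # validation runs after ingestion
```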
Qualifications:
- Proven experience as a Data Engineer (ETL, data warehousing, data lakehouse).
- Strong knowledge of Spark on Kubernetes, S3, and Docker images (a minimal S3 configuration sketch follows this list).
- Proficiency in data engineering techniques with PySpark.
- Strong experience in data warehousing techniques such as data mining, data analysis, and data profiling.
- Experience with Python scripting for automation.
- Expertise in full and incremental data loading techniques.
- Excellent problem-solving skills and attention to detail.
- Ability to work collaboratively in a team environment and communicate effectively with stakeholders.
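As referenced in the qualifications above, here is a minimal sketch of pointing a SparkSession at S3 through the s3a connector. The bucket, credentials, path, and package version are hypothetical placeholders.

```python
# A minimal sketch of configuring a SparkSession for S3 access via s3a;
# all values below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("s3_read_sketch")
         # The hadoop-aws package supplies the s3a filesystem implementation;
         # its version must match the cluster's Hadoop build.
         .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.3.4")
         .config("spark.hadoop.fs.s3a.access.key", "ACCESS_KEY")   # placeholder
         .config("spark.hadoop.fs.s3a.secret.key", "SECRET_KEY")   # placeholder
         .getOrCreate())

df = spark.read.parquet("s3a://example-bucket/raw/events/")  # hypothetical path
df.printSchema()
```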
Good to have:
- Understanding of streaming data applications.
- Hands-on experience with Apache Airflow for workflow orchestration.
- Proficiency with Git for version control.
- Understanding of data engineering integration with LLM or generative AI applications and vector databases.
- Knowledge of shell scripting, PostgreSQL, SQL Server, or MSBI.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1566889