
Job Description

About the Role :

We are looking for a passionate and experienced Data Engineer to join our growing data engineering team.

In this role, you will be responsible for designing, building, deploying, and maintaining scalable data pipelines and architectures.

You will collaborate with cross-functional teams, including data scientists, analysts, and application developers, to ensure that data flows seamlessly and is accurate and readily available for reporting and analytics.

You will play a key role in transforming raw data from multiple sources into clean, validated, and enriched datasets that are optimized for both operational and analytical use cases.

You are expected to have a deep understanding of data warehousing, ETL pipelines, and big data tools, and should be comfortable working in a fast-paced environment.


Key Responsibilities :


- Design and build scalable and robust data pipelines that ingest, transform, and store structured and unstructured data from various internal and external sources

- Develop and maintain data architectures that efficiently support data extraction, transformation, and loading (ETL/ELT) processes

- Work closely with business teams to gather data requirements and translate them into effective technical solutions

- Ensure the performance, quality, and integrity of data pipelines, while implementing data validation, error handling, and performance tuning mechanisms

- Support the migration of legacy systems to modern cloud-based data platforms

- Collaborate with DevOps and platform engineering teams to deploy, monitor, and scale data solutions in production environments

- Ensure compliance with data governance, privacy, and security policies in all aspects of data management

- Document all data flow processes, data dictionaries, and technical specifications


Required Skills and Experience :


- Minimum of 5 years of professional experience as a Data Engineer or in a related role with hands-on data pipeline development

- Strong experience in building and optimizing ETL/ELT pipelines using tools such as Apache Airflow, Talend, or custom scripts

- Expertise in SQL and data modeling with a strong understanding of relational and columnar databases such as PostgreSQL, MySQL, Snowflake, Redshift, or BigQuery

- Experience working with large-scale distributed data processing systems like Apache Spark, Hadoop, or Kafka

- Proficiency in at least one programming or scripting language such as Python, Java, or Scala

- Experience with cloud data platforms and services on AWS, GCP, or Azure, especially using services like S3, Glue, Lambda, Athena, EMR, or BigQuery

- Ability to work independently, prioritize tasks, and manage deadlines in a fast-paced environment

- Strong problem-solving skills and attention to detail, with a mindset for automation and optimization

- Excellent communication and collaboration skills


Preferred Qualifications :


- Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related field

- Experience working in Agile/Scrum environments

- Exposure to real-time streaming data pipelines is a plus

- Familiarity with version control systems like Git and CI/CD pipelines for data engineering

