
Data Engineer - Big Data

Posted on: 12/11/2025

Job Description

About the Role:

We are looking for a skilled Data Engineer (Big Data) to join our data platform team. In this role, you will design, build, and maintain scalable data pipelines and infrastructure that support analytics, reporting, and machine learning workloads. You'll work closely with data scientists, analysts, and backend teams to ensure reliable, high-quality data delivery across the organization.

Key Responsibilities:

- Design, develop, and maintain scalable data pipelines and ETL/ELT processes for structured and unstructured data (a minimal sketch follows this list).

- Build and optimize data lake and data warehouse architectures on cloud or on-premise environments.

- Work with large-scale Big Data frameworks (Hadoop, Spark, Kafka, Flink, etc.) to process and analyze high-volume datasets.

- Implement data ingestion, transformation, and storage solutions using modern tools and cloud services.

- Collaborate with data analysts and scientists to understand data requirements and design efficient data models.

- Ensure data quality, consistency, and security across all systems.

- Optimize data workflows for performance, scalability, and cost efficiency.

- Implement monitoring, logging, and alerting for data pipeline reliability.

- Contribute to data governance, metadata management, and documentation practices.
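
For a concrete flavor of the pipeline work described in the first bullet, a minimal PySpark batch job might look like the sketch below. This is an illustrative sketch only: the S3 paths, column names, and aggregation are hypothetical placeholders, not an actual production pipeline.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-events-etl").getOrCreate()

# Extract: read raw, semi-structured events (hypothetical S3 path).
raw = spark.read.json("s3://example-bucket/raw/events/")

# Transform: drop malformed rows and aggregate events per day and type.
daily = (
    raw.filter(F.col("event_type").isNotNull())
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("event_date", "event_type")
    .agg(F.count("*").alias("event_count"))
)

# Load: write partitioned Parquet into the curated zone of the data lake.
daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_event_counts/"
)

In practice, jobs like this are parameterized by run date and scheduled by an orchestrator, which is where the workflow tools listed below come in.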

Required Skills and Qualifications:

- Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.

- 3+ years of hands-on experience in data engineering or Big Data development.

- Strong proficiency in programming languages such as Python, Scala, or Java.

- Hands-on experience with Big Data tools such as Hadoop, Spark, Hive, Kafka, or Flink.

- Expertise in ETL/ELT development and workflow orchestration tools (e.g., Airflow, NiFi, Luigi); a minimal DAG sketch follows this list.

- Strong knowledge of SQL and experience with relational databases (PostgreSQL, MySQL) and NoSQL systems (MongoDB, Cassandra, HBase).

- Experience with cloud-based data platforms such as AWS (EMR, Glue, Redshift), Azure (Synapse, Data Factory), or GCP (BigQuery, Dataflow).

- Familiarity with data lake and data warehouse design principles.

- Solid understanding of data modeling, partitioning, and schema evolution.

- Proficiency with version control (Git) and CI/CD practices for data pipelines.
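
As an illustration of the orchestration experience we expect, a minimal Airflow DAG (Airflow 2.4+ syntax) might be sketched as follows; the DAG id, schedule, and task callables are hypothetical placeholders, not a real workflow.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder task bodies; in a real pipeline these would submit Spark
# jobs, run warehouse loads, etc.
def extract():
    pass

def transform():
    pass

def load():
    pass

with DAG(
    dag_id="example_daily_etl",      # hypothetical name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",               # Airflow 2.4+ keyword argument
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency chain: extract -> transform -> load.
    t_extract >> t_transform >> t_load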

Preferred Qualifications:

- Experience with Delta Lake, Apache Iceberg, or Snowflake (see the merge sketch after this list).

- Knowledge of containerization and orchestration (Docker, Kubernetes).

- Familiarity with machine learning pipelines and data science workflows.

- Understanding of data security, compliance (GDPR, HIPAA), and governance principles.

- Strong problem-solving skills and ability to work in agile, fast-paced environments.
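
For candidates less familiar with lakehouse table formats, a minimal Delta Lake upsert looks roughly like the sketch below; it assumes the delta-spark package and a Delta-enabled SparkSession, and the table paths and join key are hypothetical.

from delta.tables import DeltaTable
from pyspark.sql import SparkSession

# Delta-enabled SparkSession (assumes delta-spark is installed).
spark = (
    SparkSession.builder.appName("delta-upsert-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Staged updates and target table; both paths are placeholders.
updates = spark.read.parquet("s3://example-bucket/staging/customers/")
target = DeltaTable.forPath(spark, "s3://example-bucket/delta/customers/")

# Merge (upsert): update rows that match on the key, insert new ones.
(
    target.alias("t")
    .merge(updates.alias("u"), "t.customer_id = u.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)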

