Overview :

We are seeking a skilled Big Data Engineer to design, implement, and maintain large-scale data processing systems. The ideal candidate has extensive experience with big data frameworks, distributed computing, and cloud-based data solutions, and can handle massive datasets efficiently for analytics, AI/ML, and business intelligence.

Key Responsibilities :

- Design, build, and maintain high-performance, scalable big data pipelines for processing structured, semi-structured, and unstructured data.

- Develop and optimize ETL/ELT workflows using big data technologies.

- Work with Hadoop, Spark, Kafka, Flink, or similar frameworks to process and analyze large datasets.

- Integrate data from multiple sources, including relational databases, NoSQL databases, APIs, and streaming platforms.

- Collaborate with data scientists, analysts, and business stakeholders to understand requirements and deliver actionable insights.

- Ensure data quality, consistency, governance, and security across all systems.

- Monitor, troubleshoot, and optimize data pipelines and storage systems.

- Implement best practices for data architecture, performance tuning, and cost optimization in cloud environments.

Required Skills & Qualifications :

- 3 to 8 years of experience in big data engineering or data architecture roles.

- Strong programming skills in Python, Java, or Scala.

- Proficiency in the Hadoop ecosystem (HDFS, MapReduce), Spark, Hive, Pig, and other big data tools.

- Experience with streaming platforms such as Kafka, Flink, or Spark Streaming.

- Strong SQL skills and experience with relational and NoSQL databases (PostgreSQL, MySQL, Cassandra, MongoDB, HBase).

- Familiarity with cloud big data services and cloud data warehouses (AWS EMR, GCP Dataproc, Azure HDInsight, Snowflake, Redshift).

- Solid understanding of data modeling, data warehousing, and data lake/lakehouse architectures.

- Excellent problem-solving skills and ability to work in cross-functional teams.

Preferred Skills :

- Experience with real-time analytics and machine learning pipelines.

- Familiarity with CI/CD for data workflows and DevOps practices for data engineering.

- Understanding of data privacy, security standards, and compliance regulations.