Job Description :

We are seeking a Senior Big Data Engineer with deep expertise in on-premises Big Data platforms, specifically the Cloudera ecosystem, and a strong background in the telecom domain.

The ideal candidate will have 12-15 years of experience building and managing large-scale data lakes, developing batch and real-time pipelines, and working with distributed data systems in telecom environments.

While the primary focus is on on-premises Cloudera-based data lakes, familiarity with cloud data services (AWS, Azure, or GCP) is considered an added advantage.

Key Responsibilities :

- Design, build, and maintain on-premises data pipelines using Apache Spark, Hive, and Python on Cloudera.

- Develop real-time data ingestion workflows using Kafka, tailored for telecom datasets (e.g., usage logs, CDRs).

- Manage job orchestration and scheduling using Oozie, with data access enabled via Hue and secured through Ranger.

- Implement and manage data security policies (Kerberos, Ranger) to ensure compliance and controlled access.

- Develop and expose REST APIs for downstream integration and data access.

- Tune performance, optimize resource usage, and ensure high availability of the Cloudera platform.

- Collaborate with data architects, engineers, and business stakeholders to deliver end-to-end solutions in a telecom context.

- Support data migration to, or integration with, cloud platforms where applicable (an added advantage).

Required Skills & Experience :

- 12-15 years of experience in Big Data engineering, with a hands-on focus on on-premises data lake environments.

- Extensive telecom domain knowledge, including data models and pipelines for CDRs, BSS/OSS, and customer and network data.

Strong practical experience with :

- Cloudera (CDH/CDP) components : Spark, Hive, HDFS, HBase, Impala

- Kafka : configuration, topic management, producer/consumer setup

- Python for data transformations and automation

- Job orchestration via Oozie

- Access control and metadata management using Ranger

- Proficiency in performance tuning, resource management, and security hardening of Big Data platforms.

- Experience with API development and integration for data services.

Optional/Added Advantage :

- Exposure to cloud platforms (AWS EMR, Azure HDInsight, GCP Dataproc) and an understanding of hybrid architectures.

Preferred Qualities :

- Strong problem-solving and troubleshooting skills in distributed data systems.

- Ability to work independently in a fast-paced project environment.

- Effective communication with both technical and business teams.

- Experience in mentoring junior engineers or leading small technical teams is a plus.