Posted on: 23/12/2025
Senior Hadoop Administrator
Location : Bangalore (Hybrid) | Experience: 8-10 Years
Role Summary :
The Senior Hadoop Administrator will join the PRE-Big Data team, taking full ownership of the performance, reliability, and architectural integrity of our large-scale distributed data platforms.
This role requires a blend of deep systems administration and modern DevOps engineering to manage a diverse open-source ecosystem including Hadoop, Kafka, HBase, and Spark. You will be responsible for moving the platform beyond basic maintenance toward a highly automated, self-healing infrastructure that supports mission-critical data processing and real-time analytics for the organization.
Responsibilities :
- Administer and engineer multi-tenant Hadoop clusters, ensuring high availability and optimal resource allocation across HDFS, YARN, and MapReduce components.
- Design and implement automated deployment and configuration management workflows using Ansible, Shell, or Python to eliminate manual intervention in cluster scaling.
- Execute deep-dive root cause analysis (RCA) for complex production incidents, debugging issues across the full stack from hardware and OS to JVM and application layers.
- Architect and deploy high-availability (HA) solutions for core services to eliminate single points of failure (SPOF) and ensure continuous service delivery.
- Lead capacity planning and proactive hardware expansion initiatives to accommodate data growth and prevent performance degradation during peak processing windows.
- Perform large-scale cluster upgrades and security patching across Hadoop, Kafka, and Spark environments with minimal downtime and zero data loss.
- Develop custom observability frameworks and fine-tune alerting systems to proactively identify hardware failures, network bottlenecks, and resource contention.
- Implement robust cluster hardening techniques, focusing on Kerberos authentication, Ranger/Sentry authorization, and data-at-rest encryption.
- Collaborate with L3 engineering teams to review new use cases and ensure architectural alignment with existing platform standards and performance benchmarks.
- Create and maintain comprehensive Standard Operating Procedures (SOPs) and technical playbooks for incident, problem, and change management workflows.
- Engineer self-healing scripts and automated remediation tools to resolve common platform issues without manual SRE intervention.
- Optimize Spark and Hive execution engines through parameter tuning, memory management, and query plan analysis to meet strict Service Level Agreements (SLAs).
Technical Requirements :
- Expert-level proficiency in Hadoop administration (HDP, CDP, or Vanilla Apache) including HDFS, YARN, Hive, and ZooKeeper.
- Strong hands-on experience in managing distributed streaming and NoSQL platforms such as Apache Kafka and HBase in a production environment.
- Advanced automation skills with significant experience in Ansible for configuration management and Python/Shell for system-level scripting.
- Deep understanding of Linux internals, including kernel tuning, disk I/O optimization, and network troubleshooting for big data workloads.
- Proven ability to code in at least one high-level language, preferably Python, for developing internal tools and data reporting dashboards.
- Mastery of DevOps disciplines, including version control (Git), CI/CD pipelines, and structured change management processes.
- Experience in configuring and managing cluster security using Kerberos, TLS/SSL, and integration with LDAP/Active Directory.
- Competency in monitoring and observability stacks such as Prometheus, Grafana, ELK, or Nagios for distributed systems.
Preferred Skills :
- Experience with containerization of Big Data components using Docker and Kubernetes (K8s) for hybrid cloud deployments.
- Familiarity with cloud-native Big Data services such as Amazon EMR, Google Cloud Dataproc, or Azure HDInsight for migration projects.
- Knowledge of advanced Spark optimization techniques, including Catalyst Optimizer analysis and custom partitioner implementation.
- Prior experience with Infrastructure as Code (IaC) tools like Terraform to provision underlying compute and storage resources.
- Certification in Hadoop Administration (e.g., Cloudera Certified Administrator) or Red Hat Certified Engineer (RHCE).
- Understanding of Data Governance and lineage tools like Apache Atlas to manage metadata and data compliance.
- Experience in implementing tiered storage strategies (Hot/Warm/Cold) within HDFS to optimize infrastructure costs.