hirist

Kubernetes Platform Engineer - Gardener

CROSSDEV TECHNOLOGIES PRIVATE LIMITED
7 - 12 Years
Remote

Posted on: 30/04/2026

Job Description


We are looking for an experienced Kubernetes Platform Engineer with strong expertise in Gardener to manage and support large-scale Kubernetes environments. This role involves troubleshooting complex cluster issues, optimizing platform configurations, and ensuring high availability and performance across distributed systems.


Key Responsibilities :


- Diagnose and resolve issues across Gardener control planes and managed clusters


- Handle incidents related to provisioning, scaling, and cluster upgrades


- Perform deep root cause analysis and document findings for critical incidents


- Manage the lifecycle of Kubernetes clusters, including deployment, upgrades, and maintenance


- Work with shoot and seed clusters within the Gardener ecosystem


- Ensure cluster stability, performance, and scalability


- Review and improve platform configurations for better efficiency and reliability


- Identify bottlenecks and implement performance improvements


- Support integration between Gardener components, the underlying OS (Garden Linux), and virtualization layers (KVM-based environments)


- Collaborate with infrastructure teams to maintain seamless platform operations


- Use monitoring tools to track system health and performance


- Analyze logs and metrics to proactively identify issues


- Prepare detailed incident reports and root cause analysis (RCA) documents


- Create best practice guidelines and operational documentation


- Conduct knowledge transfer sessions within the team


Required Skills & Expertise :


- Strong understanding of Kubernetes internals (control plane, scheduling, networking)


- Hands-on experience managing production-grade Kubernetes clusters


- Experience working with Gardener architecture


- Knowledge of shoot and seed cluster operations


- Familiarity with cluster lifecycle management and troubleshooting


- Experience with monitoring tools such as Prometheus and Perses


- Ability to analyze metrics and logs for troubleshooting


- Understanding of Linux-based systems


- Experience with virtualization technologies (KVM preferred)


- Strong debugging and analytical skills


- Experience conducting root cause analysis and post-incident reviews


Key Deliverables :


- Detailed incident resolution reports


- Root Cause Analysis (RCA) documentation for major issues


- Configuration optimization recommendations


- Best practices and operational documentation


Candidate Profile :


- Strong ownership and accountability in handling critical incidents


- Ability to work in distributed/remote environments


- Good communication and collaboration skills


- Proactive approach to system stability and performance
