
Job Description

Key Responsibilities :

- Infrastructure Leadership : Lead the architecture, deployment, and operation of scalable, secure, and highly available cloud-native platforms.

- Kubernetes Expertise : Serve as the subject matter expert for Kubernetes, managing both control plane and data plane components across on-premises and public cloud environments.

- Automation & IaC : Drive Infrastructure as Code (IaC) initiatives using Terraform to manage infrastructure end-to-end, coupled with extensive automation scripting using Python.

- DevOps/SRE : Implement and champion CI/CD pipelines (GitOps methodologies preferred) and robust SRE practices for system reliability, performance, and monitoring.

- Monitoring & Observability : Configure and manage comprehensive monitoring and logging solutions using tools such as Prometheus, Grafana, the ELK Stack, and Fluent Bit.

- Networking & Security : Ensure robust networking, storage, and security configurations across both on-premises and cloud environments, focusing on resilience and compliance.

- Service Mesh & APIs : Deploy and manage service mesh solutions (Istio/Consul) and implement API Gateways (Kong preferred) to manage microservices traffic.

- Event-Driven Systems : Work with Kafka and other technologies to support highly available, event-driven microservices architectures.

- Mentorship : Provide technical guidance and mentorship to junior team members, fostering a culture of operational excellence and continuous improvement.

Required Skills and Experience (Must Have) :

- Total Experience : 10+ years of progressive experience in IT infrastructure, system administration, and cloud engineering.

- Foundational Skills : 10+ years of hands-on experience with Linux/Unix operating systems, major public clouds (AWS, GCP, or Azure), DevOps practices, and containers.

- Kubernetes Depth : 5+ years of strong, practical experience with Kubernetes, including a deep understanding of the control plane and data plane and the ability to troubleshoot both (on-premises and cloud deployments).

- IaC & Automation : Expert-level proficiency with Terraform for infrastructure provisioning and extensive automation using Python and Shell scripting.

- Networking & Systems : Strong understanding of core networking, storage, and security concepts in complex, distributed environments (on-prem and cloud).

- Cloud-Native Tools : Expertise in implementing and managing Istio / Consul service mesh and packaging applications using Helm charts.

- Observability Stack : Hands-on experience with Prometheus, Grafana, the ELK Stack, and Fluent Bit for Kubernetes monitoring and logging.

- Microservices Backbone : Experience with API Gateways (Kong preferred), Kafka, and designing event-driven microservices.

- Modern Practices : Exposure to GitOps, CI/CD, and SRE methodologies.

- Communication : Understanding of REST and gRPC communication protocols.

Desirable Skills (Good to Have) :

- Experience managing and troubleshooting distributed / multi-region Kubernetes clusters.

- Familiarity with Tanzu and VMware virtualization technologies.

- Knowledge of container security best practices (image scanning, pod security policies/standards, network policies).

- Advanced knowledge of Kubernetes networking, firewalling, and storage concepts (e.g., CNI, CSI, PV/PVC).

- Proven experience in troubleshooting complex production infrastructure issues under pressure.

