Posted on: 13/01/2026
Description:
- Architect auto-scaling, high-availability (HA), and disaster recovery (DR) systems with defined RPO/RTO.
- Design multi-region network architectures (VPCs, subnets, peering, NAT, trust zones).
- Implement Infrastructure as Code (IaC) for automated provisioning and management.
- Set up, manage, and monitor large-scale Kubernetes clusters.
- Manage persistent volumes, networking, and Kubernetes network policies at scale.
- Ensure best practices for reliability, security, and performance.
- Design and maintain CI/CD pipelines to improve developer productivity.
- Track and improve DevOps metrics (e.g., DORA).
- Implement observability and monitoring for system health and performance.
- Troubleshoot critical production issues and drive long-term fixes.
- Integrate security practices (SAST, DAST, etc.) into CI/CD workflows.
- Design systems aligned with compliance standards, and manage audits and renewals.
- Work closely with engineering teams to align DevOps strategy with product goals.
- Provide technical guidance and best practices for deployment and scaling.
Requirements:
- 8+ years of experience in DevOps / Cloud / Infrastructure roles.
- Strong hands-on expertise across AWS, GCP, and Azure.
- Proven experience designing HA, DR, and large-scale distributed systems.
- Deep understanding of Kubernetes, CI/CD, and cloud networking.
- Strong troubleshooting skills with a root-cause analysis mindset.
- Experience in AI, SaaS, or mid-stage startups is a strong plus.
- Ability to lead by example in hands-on environments.
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1600453