Posted on: 29/10/2025
Job Description :
We are seeking a highly skilled Principal Site Reliability Engineer to join our team.
The ideal candidate will have a Bachelor's or Master's degree in Computer Science, Information Technology, or a related field (or equivalent experience) with 15+ years of experience in DevOps, Infrastructure, or Site Reliability Engineering roles.
Additionally, the candidate should have 4+ years in a senior or principal-level capacity driving SRE or reliability automation initiatives, and a proven track record of designing and scaling large, distributed, cloud-native platforms.
Telecom domain experience is a plus.
Skills :
- Deep expertise in AWS (EKS, EC2, RDS, IAM, VPC, Kafka, CloudWatch, API Gateway, Lambda, WAF, KMS) and container orchestration (EKS).
- Deep expertise in Helm charts.
- Hands-on experience with APM tools (Elastic APM preferred).
- Expert in Terraform, Jenkins, Bitbucket, and Python/Bash/Go scripting for automation.
- Strong understanding of SLO/SLI frameworks, error budgets, and observability design.
- Familiarity with AIOps, chaos engineering, and event-driven automation.
- Proven experience in performance optimization, capacity planning, and resilience testing.
- Excellent documentation and system design communication skills.
Accreditation/certifications/licenses :
- AWS Certified Solutions Architect - Professional or AWS Certified DevOps Engineer - Professional.
- Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD).
Preferred :
- SRE Foundation / Google SRE / Dynatrace Performance Professional / Elastic Certified Engineer.
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1566360