We are looking for a highly experienced Senior Staff Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will bring deep technical expertise in DevOps, automation, and large-scale distributed systems, with a strong understanding of cloud operations and CI/CD frameworks. Experience in the telecom domain will be an added advantage.

Key Responsibilities:

- Design, build, and maintain scalable, reliable, and secure cloud infrastructure.

- Develop automation solutions to improve system performance, availability, and operational efficiency.

- Implement, manage, and optimize CI/CD pipelines and deployment strategies.

- Monitor, troubleshoot, and resolve complex production issues across distributed systems.

- Collaborate closely with development and operations teams to drive site reliability best practices.

- Develop observability frameworks and tools to ensure visibility into system health and performance.

- Lead incident management, post-mortems, and continuous improvement initiatives.

- Mentor junior engineers and promote a culture of automation and reliability engineering.

Requirements:

- Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent practical experience).

- 8+ years of experience in DevOps or Site Reliability Engineering (SRE) roles.

- Proven expertise in managing and scaling large distributed systems.

- Strong background in cloud operations (AWS preferred), automation, and CI/CD.

Technical Skills:

- Cloud: AWS (EKS, EC2, RDS, IAM, VPC, MSK/Kafka, CloudWatch, API Gateway, Lambda, WAF, KMS).

- Infrastructure as Code, CI/CD & Version Control: Terraform, Jenkins, Git.

- Scripting: Python, Bash.

- Monitoring & Observability: Grafana, Elastic Stack, Prometheus.

- Containerization & Orchestration: Kubernetes, Docker, microservices reliability.

- Strong understanding of Linux administration and networking fundamentals.

Preferred Certifications:

- AWS Certified DevOps Engineer or AWS Certified Solutions Architect - Associate (preferred).

- HashiCorp Terraform Associate or Certified Kubernetes Administrator (CKA) (a plus).

- SRE Foundation or Google SRE Certification (desirable).

Why Join Us:

- Work on cutting-edge distributed systems and cloud infrastructure.

- Opportunity to lead high-impact initiatives in a fast-paced, technology-driven environment.

- Collaborate with cross-functional teams in a culture that values innovation, ownership, and growth.


Posted by: Shruti, HR Generalist at Movius

Posted in: DevOps / SRE

Functional Area: Site Reliability Engineering

Job Code: 1566339
