Description:

As a Senior Site Reliability Engineer, you will own and evolve the reliability, scalability, and performance of our global infrastructure.

You'll drive automation, build resilient distributed systems, define SRE maturity, and act as a force multiplier across engineering. This role blends hands-on architecture, operational excellence, and leadership.

Responsibilities:

- Design scalable, distributed systems supporting high availability and near-zero downtime.

- Build observability frameworks across cloud environments.

- Act as the point person during outages and lead root-cause analysis, post-mortems, and long-term fixes.

- Define and enforce SLA/SLO/SLI frameworks.

- Build IaC and automation pipelines (Terraform, Ansible, Jenkins/GitHub Actions).

- Eliminate manual ops and champion platform-level automation.

- Lead capacity planning, load testing, and system optimisation.

- Improve cloud efficiency and optimise cost without compromising reliability.

- Build secure deployment models, DR strategies, and backup frameworks.

- Ensure compliance with internal and external audit requirements.

- Partner with backend, platform, DevOps, and product teams.

- Mentor junior SREs and elevate reliability culture across engineering.

Requirements:

- 6+ years of experience in SRE/DevOps managing large-scale production systems.

- Expertise in GCP (equivalent AWS or Azure experience is acceptable).

- Strong scripting experience (Python / Go / Bash).

- Deep understanding of Docker, Kubernetes and Helm.

- Hands-on experience with CI/CD tooling: Jenkins, GitHub Actions, Argo CD.

- Strong exposure to observability tools (Prometheus, Grafana, Datadog, New Relic).

- Solid fundamentals in networking, load balancing, DNS, and caching.

- Strong automation mindset with expertise in IaC (Terraform, Ansible).

- Excellent debugging skills, operational rigour, and an ownership mindset.

Posted by: Akasmat, HR at Magna Hire

Posted in: DevOps / SRE

Functional Area: DevOps / Cloud

Job Code: 1587322