
Site Reliability Team Lead - AWS Cloud Services

Shadow Placements
Chandigarh
7 - 9 Years

Posted on: 04/08/2025

Job Description

Site Reliability Engineer Team Lead | Chandigarh (Onsite) | Permanent

POSITION :

We are looking for an experienced Site Reliability Engineer Team Lead to lead an SRE team.

The ideal candidate will have a strong background in enhancing the reliability and scalability of services, leading technical teams, and driving strategic initiatives to improve a Lodging-as-a-Service platform.

RESPONSIBILITIES :

- Leadership & Mentorship : Lead, mentor, and develop a team of SREs, fostering a culture of reliability, collaboration, and continuous improvement.

- Strategic Planning : Drive the design and implementation of scalable, sustainable solutions, and lead the transition towards a cloud-native, serverless, and NoOps environment.

- Service Excellence : Oversee service availability, system performance, and capacity planning for critical.

- Cross-Functional Collaboration : Work closely with stakeholders across the organization to solve complex technical challenges and enhance user experiences.

- Incident Management : Lead incident response efforts, perform root cause analysis, and implement preventative measures.

- Process Optimization : Champion the adoption of best practices in monitoring, automation, and observability.

- SLO Management : Define and manage Service Level Objectives (SLOs) to guide prioritization and ensure reliability.

REQUIRED EXPERIENCE :

- Experience : 7+ years in site reliability engineering or related fields, with at least 2 years in a leadership role.

- Education : Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Technical Expertise :

- Extensive experience with AWS cloud services and cloud engineering best practices.

- Proficiency in programming languages such as Java and Python, plus familiarity with React.

- Deep understanding of software engineering methodologies and development cycles.

- Expertise in monitoring and observability tools (New Relic, Kibana, Prometheus, Grafana, ElasticSearch).

Leadership Skills : Proven ability to lead technical teams, manage projects, and communicate effectively with stakeholders.

Problem-Solving Skills : Exceptional analytical abilities to perform root cause analysis and develop effective solutions.

Automation & Efficiency : Strong background in automating processes and driving operational efficiency.

