Site Reliability Engineer - Splunk/AppDynamics

COGNITUD ADVISORY SERVICES PRIVATE LIMITED
Multiple Locations
2 - 4 Years

Posted on: 23/07/2025

Job Description

We are looking for a passionate Site Reliability Engineer (SRE) with a strong application support background, a developer's mindset, and a keen eye for performance and reliability.

You will play a crucial role in enhancing system performance, stability, and observability while automating IT operations and reducing toil.

Key Responsibilities :

- Design and implement SLAs, SLOs, and SLIs, and enforce error budgets to improve application reliability.

- Monitor and optimize application performance and infrastructure metrics proactively.

- Configure and maintain observability tools to improve system monitoring, alerting, and logging.

- Analyze system architecture, identify risks, and develop mitigation strategies.

- Collaborate with engineering teams for system design reviews, capacity planning, and performance tuning.

- Conduct blameless postmortems for critical incidents and use learnings to prevent recurrence.

- Provide primary operational support for critical applications and manage incident resolution.

- Develop automated solutions to reduce manual efforts, implement self-healing mechanisms, and enforce resiliency patterns (e.g., circuit breaker, bulkhead).

- Apply analytics to historical incident and usage data to predict and prevent future failures.

Required Skills & Capabilities :

- 2-3 years of experience in Site Reliability Engineering or Application Support roles.

- Hands-on experience in building dashboards and alerts using Splunk and AppDynamics.

- Solid understanding of microservices architecture and distributed systems.

- Minimum of 2 years of experience developing web-based applications (preferably in Java, Spring Boot).

- Strong understanding of monitoring, observability, and system reliability principles.

- Basic hands-on experience in SQL and database interaction.

- Experience in incident management, root cause analysis, and capacity planning.

Preferred Qualifications :

- Bachelor's or Master's degree in Computer Science, Engineering, or a related field (B.Tech / M.Tech).

- Familiarity with DevOps tools, CI/CD pipelines, and cloud infrastructure (AWS, Azure, or GCP) is a plus.

