Job Description

Description :


We are seeking skilled SRE / Production Support Engineers to manage large-scale production environments. The role involves application and batch monitoring, incident resolution, infrastructure troubleshooting, automation using scripting, and CI/CD-based deployments. You will collaborate with cross-functional teams to ensure system reliability, performance, and scalability.

Key Responsibilities :


- Perform Application, Batch, and Infrastructure (Infra/DB/Network) monitoring and troubleshooting.


- Handle incident management, root cause analysis (RCA), and resolution within defined SLAs.


- Use Splunk, APM, and Grafana for proactive monitoring, alert analysis, and system performance tracking.


- Participate in production deployments through CI/CD pipelines (Jenkins, Git, or equivalent).


- Develop and maintain automation scripts using Shell scripting (and optionally Python/PowerShell).


- Work closely with development, DevOps, and cloud teams to maintain reliability and uptime.


- Create and update runbooks, dashboards, and monitoring alerts.


- Ensure high system availability and implement preventive measures for recurring issues.

Technical Skills Required :


- Monitoring & Logging : Splunk, Grafana, AppDynamics / Dynatrace / New Relic (APM Tools)


- Infrastructure Troubleshooting : Application, Database (DB), and Network (NW) layers


- Scripting : Shell scripting (mandatory), Python (preferred)


- CI/CD & Deployment : Jenkins, Git, or other CI/CD tools


- Cloud (for Hyderabad role) : AWS / Azure / GCP (any one)


- OS & Infra Knowledge : Linux/Unix environments, strong understanding of infrastructure

Preferred Qualifications :


- Bachelor's degree in Computer Science, IT, or a related field


- 5-8 years of experience in SRE / Production Support roles


- Strong analytical and troubleshooting skills


- Excellent communication and collaboration abilities


- Willingness to work in 24/7 rotational shifts

