Job Description

Location : Pune.

Job Title : NOC Engineer (5-6 Years Experience).

Responsibilities :


- Monitor production systems, applications, and infrastructure for availability, performance, and stability.

- Respond promptly to incidents, outages, and alerts; perform root cause analysis and drive resolution.

- Work closely with SRE/DevOps teams to improve monitoring, alerting, and system reliability.

- Create and maintain detailed runbooks, incident documentation, and SOPs.

- Troubleshoot and resolve complex infrastructure and application-level issues across distributed systems.

- Perform log analysis, basic scripting, and system diagnostics to support issue resolution.

- Participate in 24/7 rotational shifts and on-call support as required.

- Ensure SLAs and SLOs are met and help reduce MTTR (Mean Time to Resolution).

- Maintain dashboards and observability tools to monitor key health indicators.

- Contribute to automation efforts for incident response, routine checks, and reporting.

- Collaborate with engineering and platform teams to support deployments, upgrades, and maintenance tasks.

Required Qualifications :


- 5-6 years of experience in NOC, SRE, or DevOps roles in production environments.

- Solid understanding of monitoring and observability tools such as New Relic, Datadog, Grafana, Prometheus, or ELK.

- Experience with cloud platforms (AWS, Azure, or GCP).

- Familiarity with containers and container orchestration (Docker, Kubernetes) and infrastructure as code (Terraform).

- Strong troubleshooting skills for network, infrastructure, and application issues.

- Basic scripting experience in Bash, Python, or similar for automation tasks.

- Understanding of CI/CD pipelines and DevOps best practices.

- Good communication skills and ability to work in high-pressure environments.

- Willingness to work in a 24/7 support rotation and take ownership of issues.

Primary Skills :


- Monitoring & Observability (New Relic, Datadog, Grafana, ELK).

- Infrastructure & Troubleshooting (Linux, Networking, Log Analysis).

- Automation & Scripting (Bash, Python).

- Cloud Platforms (AWS/Azure/GCP).

- Terraform (basic to intermediate).

