
Job Description

Job Summary :


We are looking for a NOC Support Engineer (Software) to monitor, support, and troubleshoot enterprise-grade software applications and platforms in a 24x7 Network Operations Center (NOC) environment. The ideal candidate will be responsible for ensuring high availability, performance, and reliability of applications and systems while proactively identifying and resolving issues before they impact customers. This role requires strong technical troubleshooting skills, experience with monitoring tools, and the ability to work in shift-based operations supporting mission-critical systems.


Key Responsibilities :


Monitoring & Incident Management :


- Monitor application health, system performance, and service availability using NOC monitoring tools


- Perform real-time incident detection, logging, triage, and resolution


- Respond to alerts, alarms, and service disruptions within defined SLA timelines


- Escalate unresolved issues to L2/L3 engineering teams with detailed analysis and documentation


- Coordinate with cross-functional teams during major incidents and outages


- Provide L1/L2 support for software applications, platforms, and services


- Troubleshoot issues related to application performance, integrations, APIs, and services


- Analyze logs, metrics, and dashboards to identify root causes


- Perform basic configuration changes and service restarts as per SOPs


- Validate fixes and monitor post-incident stability


Operations & Maintenance :


- Execute routine operational tasks such as health checks, backup verification, and batch job monitoring


- Maintain and update runbooks, SOPs, and knowledge base articles


- Support deployments, patches, and scheduled maintenance activities


- Ensure compliance with operational processes and security standards


Reporting & Continuous Improvement :


- Prepare shift handover notes, incident reports, and daily operational summaries


- Track recurring issues and recommend automation or process improvements


- Contribute to reducing MTTR, downtime, and false alerts


- Participate in post-incident reviews and root cause analysis (RCA)


Required Skills & Qualifications :


Technical Skills :


- 1-4 years of experience in NOC / Application Support / Software Operations


- Strong understanding of software applications, system architecture, and data flows


- Experience with monitoring tools (e.g., Nagios, Zabbix, Grafana, Prometheus, AppDynamics, New Relic, Splunk)


- Basic knowledge of Linux/Unix systems and command-line troubleshooting


- Experience analyzing application logs, metrics, and alerts


- Understanding of incident management and ITIL processes


- Basic scripting knowledge in Python, Shell, or PowerShell


- Familiarity with automation and alerting improvements


Cloud & Infrastructure (Good to Have) :


- Exposure to AWS, GCP, or Azure


- Basic understanding of containers (Docker) and orchestration (Kubernetes)


- Networking fundamentals (DNS, TCP/IP, HTTP/HTTPS)


Education : Bachelor's degree in Computer Science, Information Technology, Electronics, or a related field
