Posted on: 24/10/2025
Description :
- Guide the team in handling critical situations and high-severity incidents.
- Manage and resolve L2/L3 technical issues, ensuring minimal downtime and quick resolution.
- Handle customer escalations and coordinate with internal stakeholders for timely resolution.
- Escalate critical issues through the appropriate hierarchical channels.
- Collaborate with cross-functional teams including development, QA, and infrastructure.
- Monitor systems using industry-standard tools and respond to alerts proactively.
- Guide the team in performing production deployments, patching, and release management.
- Maintain and troubleshoot on-premises server infrastructure.
- Ensure adherence to operational processes and documentation standards.
- Participate in 24x7 shift rotations and provide on-call support as needed.
Required Skills & Qualifications :
- Proven experience in leading and managing technical teams.
- Strong Linux administration skills (command-line, scripting, troubleshooting).
- Solid understanding of networking fundamentals (beyond basic level).
- Hands-on experience with monitoring tools (e.g., Nagios, Zabbix, Prometheus).
- Experience in incident management and root cause analysis.
- Familiarity with ITIL processes and escalation management.
- Experience in production deployment and release management.
- Working knowledge of Hadoop and distributed systems.
- Basic knowledge of Docker and Ansible.
- Excellent communication and coordination skills.
- Ability to work under pressure in a fast-paced environment.
Preferred Qualifications :
- Experience with cloud platforms (AWS, Azure, GCP).
- Automation/scripting experience (Shell, Python, Ansible).