Posted on: 11/07/2025
Key Responsibilities :
L2 Application Support :
- Provide advanced L2 technical support for a portfolio of business-critical applications, serving as a primary escalation point for complex incidents and service requests that cannot be resolved by L1 teams.
Incident Management :
- Lead incident resolution efforts, including diagnosis, troubleshooting, workaround implementation, and communication with stakeholders.
- Drive incidents to resolution within agreed-upon SLAs.
Problem Management :
- Conduct root cause analysis (RCA) for recurring issues, identify long-term solutions, and implement preventative measures to minimize future occurrences.
GCP Environment Expertise :
- Troubleshoot application issues within Google Cloud Platform (GCP) environments, drawing on a deep understanding of the platform.
- Maintain familiarity with core GCP services such as Compute Engine, Cloud SQL, Cloud Storage, Cloud Functions, GKE (Google Kubernetes Engine), Cloud Pub/Sub, Cloud Logging, and Cloud Monitoring.
Monitoring and Alerting :
- Configure, maintain, and respond to alerts from various monitoring tools (e.g., Google Cloud Operations Suite/Stackdriver, Prometheus, Grafana, Datadog, New Relic) to proactively identify and address potential issues before they impact users.
Performance Analysis :
- Monitor application performance metrics, identify deviations, and collaborate with development and DevOps teams to optimize application performance and resource utilization within GCP.
Change and Release Management :
- Participate in change management processes, review release notes, and provide support for application deployments and upgrades, ensuring minimal disruption.
Documentation and Knowledge Management :
- Create and maintain comprehensive technical documentation, runbooks, FAQs, and knowledge base articles for supported applications and common issues.
- Train L1 support staff as needed.
Stakeholder Communication :
- Effectively communicate with internal teams (development, QA, DevOps, product) and external stakeholders regarding incident status, resolutions, and planned maintenance.
Process Improvement :
- Continuously identify opportunities for process automation, efficiency improvements, and enhanced support methodologies.
Mandatory Skills & Qualifications :
- Extensive hands-on experience with Google Cloud Platform (GCP) services for application hosting, monitoring, and troubleshooting.
- Proven track record of providing L2 (Level 2) technical support for complex enterprise applications.
- Proficiency in using and configuring various monitoring tools (e.g., Google Cloud Operations Suite/Stackdriver, Prometheus, Grafana, Datadog, New Relic, Splunk).
- Strong understanding of ITIL processes (Incident, Problem, Change Management).
- Solid understanding of application architectures (e.g., microservices, monoliths) and relational/NoSQL databases.
- Ability to read and understand application logs, stack traces, and system metrics to diagnose issues.
- Proficiency in scripting languages (e.g., Python, Shell) for automation and troubleshooting is a strong plus.
- Excellent analytical, problem-solving, and debugging skills.
- Exceptional communication skills, both written and verbal, with the ability to explain technical concepts clearly to diverse audiences.
- Ability to work independently, prioritize tasks effectively, and manage multiple incidents concurrently in a fast-paced environment.
- Strong customer service orientation.
Preferred Qualifications :