Posted on: 13/10/2025
Description :
Qualifications :
- Proven experience as a Site Reliability Engineer, Senior DevOps Engineer, or in a similar role.
- 5 to 7 years of relevant experience, with at least 2 years of experience in Microsoft Azure. AWS and GCP experience is good to have.
- Experience setting up and managing OpenTelemetry (OTel) with Loki, Tempo, Prometheus, Grafana, Alloy, etc.
- Experience creating CI/CD pipelines using Azure DevOps, Jenkins, Spinnaker, Terraform, Ansible, Docker, Kubernetes, etc.
Key Responsibilities :
Monitoring and Incident Response :
- Proactively monitor system performance and availability using OpenTelemetry (OTel).
- Manage incidents and troubleshoot issues in real time.
- Implement and improve incident management processes.
Automation and Efficiency :
- Develop and maintain automation scripts and tools to enhance operational efficiency.
- Automate repetitive tasks to reduce manual intervention.
- Ensure that continuous integration and delivery (CI/CD) pipelines are robust and efficient.
Performance and Capacity Management :
- Conduct performance tuning, optimization, and capacity planning.
- Perform root cause analysis and conduct post-mortem discussions for incidents.
- Implement solutions to improve system reliability and performance.
Collaboration and Communication :
- Work closely with development teams to ensure systems are designed with reliability and scalability in mind.
- Communicate effectively with stakeholders to provide updates and insights on system health.
Posted in
DevOps / SRE
Functional Area
DevOps / Cloud
Job Code
1560266