Posted on: 13/10/2025
Key Responsibilities :
- Design, develop, and maintain monitoring and alerting systems using Grafana, Prometheus, and AWS CloudWatch.
- Create and optimize dashboards to provide actionable insights into system and application performance.
- Collaborate with development and operations teams to ensure high availability and reliability
of services.
- Proactively identify performance bottlenecks and drive improvements.
- Continuously explore and adopt new monitoring/observability tools and best practices.
Required Skills & Qualifications :
- Minimum 2 years of experience in SRE, DevOps, or related roles.
- Hands-on expertise in Grafana, Prometheus, and AWS CloudWatch.
- Proven experience in dashboard creation, visualization, and alerting setup.
- Strong understanding of system monitoring, logging, and metrics collection.
- Excellent problem-solving and troubleshooting skills.
- Quick learner with a proactive attitude and adaptability to new technologies.
Good to Have (Optional) :
- Experience with AWS services beyond CloudWatch.
- Familiarity with containerization (Docker, Kubernetes) and CI/CD pipelines.
- Scripting knowledge (Python, Bash, or similar).
Did you find something suspicious?
Posted By
Posted in
DevOps / SRE
Functional Area
DevOps / Cloud
Job Code
1559922
Interview Questions for you
View All