Posted on: 11/08/2025
We are seeking a Senior Observability Engineer with strong expertise in Grafana and Python to lead telemetry, monitoring, and automation efforts across our cloud-native infrastructure.
This role is critical in shaping our observability strategy, building real-time dashboards, and automating alerting pipelines to ensure high system availability and performance.
Key Responsibilities:
- Design, develop, and maintain Grafana dashboards for real-time infrastructure and application monitoring.
- Build and enhance Python-based automation tools for telemetry data processing, health checks, and alerts.
- Integrate observability solutions with Azure Monitor, Log Analytics, Prometheus, and OpenTelemetry.
- Define and implement SLIs, SLOs, and proactive alerting mechanisms.
- Collaborate with SREs, DevOps, and developers to improve monitoring coverage and incident response.
- Contribute to infrastructure automation and CI/CD workflows using Python, Git, and DevOps tools.
- Lead tool selection, define observability best practices, and drive their adoption across engineering teams.
Requirements:
- 5+ years of experience in observability, DevOps, or SRE roles.
- Strong hands-on experience with Grafana, including templating, alerting, and data source integration.
- Proficiency in Python scripting for automation and data processing.
- Experience with Prometheus, Azure Monitor, Log Analytics, and Kubernetes.
- Familiarity with distributed systems, tracing, and telemetry pipelines.
- Exposure to tools like Loki, OpenTelemetry, ArgoCD, or Terraform is a plus.
Nice to Have:
- Experience with CI/CD pipelines (Jenkins, Azure DevOps, GitHub Actions).
- Knowledge of containerized environments (Docker, Kubernetes, AKS).
- Ability to design cost-efficient monitoring solutions and dashboards.
Benefits:
- Fun, happy, and politics-free work culture built on the principles of lean and self-organisation.
- Work with large-scale systems powering global businesses.
- Competitive salary and benefits.
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1528204