Posted on: 20/11/2025
Description :
Job Title : Senior Site Reliability Engineer (SRE), Datadog Observability
Experience Required : 8+ years overall in SRE and Infrastructure Operations, including at least 3 years of hands-on experience with Datadog.
Location : Hyderabad preferred; Pune and remote are also open.
Job Summary :
The ideal candidate will bring deep technical expertise in building reliable, scalable, and observable systems, with hands-on experience in integrating enterprise applications and middleware.
Key Responsibilities :
- Design, configure, and manage Datadog dashboards, monitors, alerts, and APM for proactive issue detection and resolution.
- Utilize the Datadog Roles API to create and manage user roles, global permissions, and access controls for various teams.
- Collaborate with product managers, engineering teams, and business stakeholders to identify observability gaps and design solutions using Datadog.
- Implement automation for alerting, incident response, and ticket creation to improve operational efficiency.
- Work closely with business and IT teams to support critical Financial Month-End, Quarter-End, and Year-End closures.
- Leverage Datadog AI.
- Provide technical leadership in observability, reliability, and performance engineering practices.
Required Skills And Experience :
- At least 3 years of hands-on experience with Datadog (dashboards, APM, alerting, log management, Roles API, and monitoring setup).
- Proven experience implementing SRE best practices: incident management, postmortems, automation, and reliability metrics.
- Excellent stakeholder management and communication skills; experience collaborating with business and IT teams.
- Strong problem-solving mindset and ability to work in high-pressure production support environments.
Preferred Qualifications :
- Certification in Datadog or related observability platforms.
- Experience in cloud platforms (AWS, Azure, or OCI).
- Exposure to ITIL-based production support processes.
Posted in : DevOps / SRE
Functional Area : DevOps / Cloud
Job Code : 1578090