Posted on: 04/02/2026
Key Responsibilities:
- Oversee the management and configuration of Dynatrace Managed for enterprise applications.
- Configure and optimize dashboards, synthetic monitors, business transactions, and key performance indicators (KPIs) for applications and infrastructure.
- Define, tune, and optimize problem detection rules, thresholds, and event correlation to reduce noise and improve alert quality.
- Develop custom metrics, health checks, and other monitoring solutions, ensuring full application and infrastructure coverage.
- Perform OneAgent and Cluster Node upgrades with minimal disruption while maintaining system stability.
- Deploy OneAgent to OpenShift and Kubernetes workloads based on monitoring requirements.
- Set up and manage ActiveGates for custom plugins, extensions, and synthetic monitoring.
- Implement integrations with ticketing, notification, and automation tools; build API-based automation for configuration management.
- Set up Real User Monitoring (RUM) for applications to gather performance insights.
- Troubleshoot complex issues, perform log analysis, and drive incidents to resolution.
- Collaborate with development, platform, and operations teams to integrate monitoring across tech stacks.
- Lead continuous improvement initiatives, such as reducing alert noise, enhancing dashboards, and expanding coverage.
- Manage incidents, service requests, change requests, and problem records in adherence to ITIL best practices.
- Handle SSL certificate renewals, port checks, and firewall configurations for security compliance.
- Engage with Dynatrace Support for issue escalation and ticket management.
- Mentor junior engineers, sharing Dynatrace best practices and facilitating knowledge transfer.
Required Qualifications:
- 6 to 8 years of hands-on experience with Dynatrace Managed in large-scale, enterprise environments.
- Extensive experience with OneAgent, ActiveGates, synthetic monitoring, and Real User Monitoring (RUM) configurations.
- Deep knowledge of application performance monitoring (APM), metrics gathering, and alert optimization.
- Proficiency with cloud environments (AWS, GCP) and container orchestration platforms (OpenShift, Kubernetes).
- Advanced understanding of monitoring best practices, performance tuning, and troubleshooting techniques.
- Strong Linux/Unix command-line skills for troubleshooting and scripting.
- Familiarity with networking concepts, including firewalls, SSL certificates, and port management.
- Experience with automation (e.g., Shell scripting, Python) and API integrations.
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1609855