Posted on: 22/12/2025
Job Responsibilities:
- Engineer comprehensive APM solutions and establish clear standards for APM and agent deployments across all environments.
- Define best practices for APM configuration, optimizations, and tunings based on specific application and business requirements.
- Drive the adoption and effective usage of APM tools within Cloud IaaS (Infrastructure-as-a-Service) and PaaS (Platform-as-a-Service) models.
- Devise a robust authentication and authorization model for our APM platform, aligning with the client's standard security framework.
- Develop necessary customizations and automation scripts to enforce security policies and streamline user access management within the APM environment.
- Architect a highly available and highly scalable APM infrastructure with built-in monitoring and alerting mechanisms to ensure the health and performance of the APM platform itself.
- Proactively identify and implement opportunities to automate the deployment of APM agents and configurations, reducing manual operational tasks and improving consistency.
- Identify and implement opportunities for seamless integration of the APM platform with other existing monitoring tools within the client's portfolio, creating a unified observability landscape.
- Thoroughly test and implement APM Monitoring Extensions that are appropriate for the client's diverse technology stack, including web servers (e.g., Apache, Tomcat) and messaging tiers (e.g., Kafka, RabbitMQ).
- Expertly create, rigorously test, and implement APM Agents for comprehensive application monitoring of both Java and DotNet-based applications.
- Provide leadership and guidance in performance troubleshooting efforts, leveraging APM data to identify root causes and recommend effective remediation strategies.
- Implement and optimize End User Experience Monitoring solutions within the APM framework to gain critical insights into application responsiveness and user satisfaction.
- Proficiently write and execute Ansible playbooks for infrastructure automation and configuration management.
- Develop and maintain Azure DevOps (AZDO) pipelines for automated deployment and management of APM components.
- Possess strong working knowledge of various operating systems, including Linux, Windows, and AIX, and their impact on application performance and monitoring.
- Utilize Git for effective version control of configurations, scripts, and other APM-related artifacts.
- Leverage advanced Python coding skills for developing custom monitoring scripts, automation tasks, and data analysis.
- Demonstrate strong communication skills to effectively articulate technical concepts and collaborate with cross-functional teams.
- Understand and apply SAFe (Scaled Agile Framework) and DevOps principles within the context of APM implementation and management.
- Possess knowledge and practical usage of containerization platforms like OpenShift (Kubernetes) and their impact on application monitoring.
- Demonstrate strong knowledge and practical usage of various observability tools beyond traditional APM, such as Elastic (ELK Stack), Grafana, Prometheus, and OpenTelemetry (OTEL).
Education & Experience:
- 7+ years of total experience in IT.
- At least 4 years of dedicated experience with Application Performance Monitoring (APM) tools, including agent deployments and implementation of APM Agents for Java and DotNet-based applications.
- Proven experience with Elastic Monitoring (Elasticsearch, Kibana, Beats, Logstash).
- Demonstrated experience in performance troubleshooting and root cause analysis using APM data.
- Hands-on experience with End User Experience Monitoring (EUEM) tools and techniques.
- Practical experience with Apache and Tomcat web server monitoring.
- Proficient in writing and executing Ansible playbooks for automation.
- Experience in developing and maintaining Azure DevOps (AZDO) pipelines.
- Strong operating system knowledge (Linux, Windows, AIX).
- Proficient in using Git for version control.
- Advanced Python coding skills for automation and scripting.
- Excellent written and verbal communication skills.
- Familiarity with SAFe (Scaled Agile Framework) and DevOps practices.
- Experience with OpenShift (Kubernetes) and containerized application monitoring.
- Solid understanding and practical usage of observability tools (Elastic, Grafana, Prometheus, OTEL).
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1593698