Posted on: 11/07/2025
Job Summary :
Key Responsibilities :
Reliability & Performance :
- Drive root cause analysis (RCA) and implement long-term solutions to prevent recurrence of incidents.
- Manage capacity planning, scalability, and performance tuning across cloud and on-prem environments.
- Lead and participate in the on-call rotation, providing timely support and issue resolution.
DevOps Automation & CI/CD :
- Design, implement, and maintain CI/CD pipelines using Jenkins, GitHub, and other DevOps tools.
- Enhance automation for routine operational tasks, incident response, and self-healing capabilities.
Monitoring & Observability :
- Implement and manage enterprise monitoring solutions including Splunk, Dynatrace, Prometheus, and Grafana.
- Continuously improve observability, logging, and tracing across all environments.
Cloud Platforms & Infrastructure :
- Work with AWS, Azure, and PCF (Pivotal Cloud Foundry) environments, managing cloud-native services and infrastructure.
- Collaborate with cloud security and networking teams to ensure secure and compliant infrastructure.
Payment Systems Expertise :
- Apply your understanding of card payment systems to ensure platform reliability and compliance.
- Collaborate with product and development teams to ensure alignment with business objectives.
Posted in: DevOps / SRE
Functional Area: Site Reliability Engineering
Job Code: 1511834