Posted on: 18/03/2026
Overview :
We are from the Technology Operations Platform, and our vision is to improve the user experience of business users and the engineering community by making technology operations simpler and more efficient.
Our ultimate goal is to develop a comprehensive suite of tools and solutions that helps Site Reliability Engineering (SRE) teams get started quickly and manage performance and reliability effectively across the organization.
Team/Opportunity :
As a Site Reliability Engineer at Maersk, you will play a critical role in ensuring the reliability, scalability, and performance of our global systems. You will work closely with development and operations teams to automate processes, build resilient infrastructure, and drive continuous improvement. This role demands strong expertise in SRE principles, with a focus on automation and observability. AI/ML knowledge is a valuable plus.
Key Focus Areas :
- Status Page : Maintain a status page for transparency and incident communication.
- Zero-Touch Automation : Develop strategies that eliminate manual intervention.
- Middleware & Microservices : Apply architectural know-how to build robust service interactions.
- Observability : Use observability tools and practices to build scalable SRE and automation solutions for global platforms and users. Collaborate with the Observability team to enhance system insights without overlapping responsibilities.
- Operational Excellence : Apply operational excellence principles to improve reliability and team efficiency.
Key Responsibilities :
- Design, implement, and maintain scalable and reliable infrastructure.
- Develop automations to eliminate manual, redundant toil.
- Collaborate with cross-functional teams to define SLIs, SLOs, and error budgets.
- Monitor system performance and availability using tools like Prometheus and Grafana.
- Conduct root cause analysis and postmortems for incidents.
- Drive adoption of SRE best practices across engineering teams.
- Participate in on-call rotations and proactively prevent incidents.
- Support AI/ML workloads and infrastructure where applicable.
- Demonstrate strong expertise in SRE and automation technologies.
- Act as a problem solver and critical thinker in complex technical scenarios.
Required Qualifications :
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 9+ years of experience in SRE, DevOps, or backend engineering roles.
- Strong programming and automation skills in Python and Ansible.
- Hands-on experience with at least one cloud platform (AWS, GCP, or Azure).
- Proficiency in containers and container orchestration (Docker, Kubernetes).
- Deep understanding of distributed systems and reliability engineering.
- Experience with monitoring, logging, and alerting systems.
- Excellent problem-solving and communication skills.
Posted in: DevOps / SRE
Functional Area: Site Reliability Engineering
Job Code: 1621685