Posted on: 29/10/2025
Description :
- Provide leadership and management to a remote team of Site Reliability Engineers, ensuring alignment with organizational priorities and goals.
- Oversee team operations, including incident management, technical support, and infrastructure maintenance.
- Act as the primary point of escalation for complex technical issues, collaborating with the Director of Systems and Security and with the Quality Assurance and Product teams as needed.
- Ensure the team adheres to established SLAs for issue resolution and maintains high customer satisfaction levels.
- Mentor and develop team members, fostering growth in technical skills, problem-solving abilities, and customer engagement.
- Lead initiatives to improve operational processes, tools, and workflows, driving greater efficiency and reliability.
- Collaborate with cross-functional teams, including Product, Engineering, and Operations, to address customer needs and improve platform performance.
- Facilitate regular team meetings, performance reviews, and one-on-one sessions to ensure clear communication and ongoing development.
- Maintain and report on key performance metrics, providing insights and recommendations to senior leadership.
- Stay informed on industry trends and best practices, ensuring the team is equipped with the latest tools and methodologies.
- Participate in strategic planning and contribute to the continuous improvement of the SRE function.
Qualifications :
- Proven experience managing technical teams, preferably in Site Reliability Engineering, DevOps, or a related field.
- Strong technical background in cloud computing and infrastructure management, particularly with AWS and Linux-based systems.
- Demonstrated ability to lead and mentor teams in remote and distributed environments.
- Excellent written and spoken English, with strong interpersonal skills and the ability to engage effectively with both technical and non-technical stakeholders.
- Strong problem-solving and decision-making abilities, with a focus on root cause analysis and long-term solutions.
- Experience with automation tools (Terraform, Ansible, CloudFormation) and CI/CD pipelines.
- Familiarity with incident management practices and tools, as well as ticketing systems.
- High attention to detail and a commitment to operational excellence.
- Bachelor's degree in a technical or quantitative science field, or equivalent work experience.
Preferred Qualifications :
- AWS certification (any level).
- Experience leading customer-facing technical teams, with a focus on improving service delivery.
- Knowledge of security best practices and governance in cloud environments.
- Strong understanding of networking concepts and system architecture.
Key Attributes :
- Empathetic leader who values collaboration, transparency, and accountability.
- Proactive mindset with a focus on continuous improvement and innovation.
- Ability to prioritize and manage multiple initiatives in a fast-paced environment.
- Strategic thinker who can align team efforts with broader organizational objectives.
- Passion for enabling team growth and fostering a culture of learning and development.
Posted in: DevOps / SRE
Functional Area: Site Reliability Engineering
Job Code: 1567087