Posted on: 23/01/2026
Description :
Job Title : Manager Site Reliability Engineer
Experience : 12+ Years
About Ensono :
Ensono is a trusted technology adviser and managed service provider, helping organizations accelerate their digital transformation across hybrid environments. With deep expertise in cloud platforms, application modernization, DevOps, and managed services, Ensono partners with clients to deliver resilient, secure, and scalable technology solutions. Headquartered in the greater Chicago area, Ensono has 3,500+ associates globally and is recognized for its excellence across AWS, Azure, and Google Cloud ecosystems.
At Ensono, we believe in being a relentless ally, supporting our clients with 24/7 services, cross-platform expertise, and outcome-driven solutions that enable continuous innovation.
Role Summary :
The Manager Site Reliability Engineer (SRE) will lead the design, implementation, and continuous improvement of reliability, scalability, and operational excellence across mission-critical systems. This role combines deep technical expertise with leadership responsibilities, ensuring robust service delivery while driving automation, incident management excellence, and proactive risk mitigation for multiple clients.
Key Responsibilities
Site Reliability & Operations :
- Lead the reliability, availability, and performance of cloud and hybrid systems across client environments.
- Identify systemic issues from incidents and failures, and drive long-term corrective actions.
- Implement fixes, enhancements, and architectural changes that strengthen service resilience.
- Proactively reduce operational toil through automation and process optimization.
Incident & Problem Management :
- Provide leadership during major incident resolution, ensuring timely communication and effective remediation.
- Own post-incident reviews (post-mortems), root cause analysis, and mitigation planning.
- Create, maintain, and continuously improve operational documentation and runbooks.
Automation & DevOps :
- Design, implement, and refine automation for incident resolution and service request fulfillment.
- Drive Infrastructure as Code (IaC) adoption using tools such as Terraform (preferred) or ARM/Bicep.
- Collaborate with teams to enhance CI/CD pipelines using Azure DevOps, GitHub Actions, GitLab, or Harness.
Monitoring & Observability :
- Implement and optimize monitoring and observability solutions using tools such as Datadog, Splunk, New Relic, and Azure Monitor.
- Define meaningful SLIs, SLOs, and alerts to ensure proactive system health management.
Security & Risk Management :
- Develop and enhance processes for proactive security management, covering vulnerabilities in code, infrastructure, and dependencies.
- Work closely with stakeholders to mitigate risks and improve overall system security posture.
Stakeholder & Client Engagement :
- Lead client-facing discussions around SRE practices, identifying opportunities to expand SRE adoption and value delivery.
- Partner with internal teams to identify cross-sell and cross-collaboration opportunities within Ensono.
- Engage with suppliers and third-party vendors for support, enhancements, and operational improvements.
Technical Skills & Expertise :
- Strong commercial experience with Infrastructure as Code tools such as Terraform (preferred), ARM, or Bicep.
- Hands-on expertise with CI/CD tooling including Azure DevOps, GitHub Actions, GitLab; experience with Harness is a plus.
- Proficiency with monitoring and observability tools such as Datadog, Splunk, New Relic, and Azure Monitor.
- Demonstrable experience across multiple core technologies, including .NET, Java, JavaScript, and AI/Data Engineering.
- Strong troubleshooting skills with the ability to identify systemic failures and performance bottlenecks.
- Solid understanding of cloud-native and hybrid architectures, primarily on Microsoft Azure.
Leadership & Behavioral Skills :
- Proven ability to lead technical discussions, mentor engineers, and drive best practices across teams.
- Strong stakeholder management and communication skills, especially in client-facing environments.
- Outcome-driven mindset with a focus on reliability, scalability, and operational excellence.
Qualifications & Certifications :
- 12+ years of overall industry experience in SRE, DevOps, or platform engineering roles.
- Azure Associate-level and DevOps Engineer certifications are highly beneficial.
- CKAD certification is highly desirable at hire; otherwise it must be obtained within the probationary period.
Equal Opportunity Statement :
Ensono is an equal opportunity employer. All qualified applicants will be considered without regard to caste, colour, creed, religion, gender, gender identity, sexual orientation, age, disability, HIV status, or any other status protected by law. Candidates requiring accommodations during the recruitment process are encouraged to contact the Talent Acquisition team.
Posted in : DevOps / SRE
Functional Area : Site Reliability Engineering
Job Code : 1605363