Posted on: 10/12/2025
Description :
Responsibilities :
- Work proactively with cross-functional colleagues to understand their requirements and provide suitable supporting solutions.
- Develop and introduce systems that support rapid growth, including deployment policies, new procedures, configuration management, patch planning, and capacity upgrades.
- Observability : ensure suitable levels of monitoring and alerting are in place to keep engineers aware of issues.
- Establish runbooks and procedures to keep outages to a minimum.
- Jump in before users notice that things are off track, then automate the fix for the future.
- Automate everything so that nothing is ever done manually in production.
- Identify and mitigate reliability and security risks.
- Make sure we are prepared for peak times, DDoS attacks, and fat fingers.
- Troubleshoot issues across the whole stack - software, applications, and network.
- Manage individual project priorities, deadlines, and deliverables as part of a self-organizing team.
- Learn and unlearn every day by exchanging knowledge and new insights, conducting constructive code reviews, and participating in retrospectives.
Requirements :
- 2+ years of implementing systems that are highly available, secure, scalable, and self-healing on the Azure cloud platform.
- Strong understanding of networking, especially in cloud environments, along with a good understanding of CI/CD.
- Prior experience implementing industry-standard security best practices, including those recommended by Azure.
- Proficiency with Bash and any high-level scripting language.
- Basic working knowledge of observability stacks like ELK, Prometheus, Grafana, SigNoz, etc.
- Proficiency with Infrastructure as Code and Infrastructure Testing, preferably using Pulumi/Terraform.
- Hands-on experience in building and administering VMs and Containers using tools such as Docker/Kubernetes.
- Excellent communication skills, spoken as well as written, with a demonstrated ability to articulate technical problems and projects to all stakeholders.
Posted in
DevOps / SRE
Functional Area
DevOps / Cloud
Job Code
1588013