Posted on: 22/12/2025
Description:
Job Summary:
We are seeking a highly skilled Site Reliability Engineer (SRE) with strong expertise in Infrastructure as Code (IaC) using Terraform.
The role focuses on building, automating, and maintaining scalable, reliable, and secure cloud infrastructure while improving system reliability, observability, and operational excellence.
You will work closely with software engineering, DevOps, and security teams to ensure high availability and performance of production systems.
Key Responsibilities:
- Design, implement, and maintain highly available, fault-tolerant, and scalable infrastructure
- Define and manage SLIs, SLOs, and SLAs to ensure system reliability
- Monitor system health, performance, and availability using observability tools
- Participate in on-call rotations and handle incident response, root cause analysis, and postmortems
- Drive continuous improvement in system reliability and operational efficiency
- Design, develop, and maintain Terraform modules and environments for cloud infrastructure
- Enforce IaC best practices including code reviews, versioning, and modular design
- Automate provisioning and configuration of cloud resources using Terraform
- Manage Terraform state, workspaces, and environment separation
- Integrate Terraform workflows into CI/CD pipelines
- Build automation for deployments, scaling, and operational tasks
- Collaborate with development teams to improve release reliability and deployment strategies
- Implement blue/green, canary, or rolling deployments
- Improve infrastructure security, cost optimization, and compliance
- Work with architects and engineers to design cloud-native and microservices architectures
- Provide guidance on reliability, scalability, and operational best practices
- Document infrastructure, runbooks, and operational procedures
Required Skills & Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field
- 4+ years of experience in SRE, DevOps, or Cloud Engineering roles
- Strong hands-on experience with Terraform for infrastructure provisioning
- Experience with cloud platforms: AWS, Azure, or GCP
- Solid understanding of Linux systems, networking, and security fundamentals
- Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins)
- Proficiency in scripting/programming (Python, Bash, or Go)
- Experience with monitoring and observability tools (Prometheus, Grafana, ELK, Datadog, CloudWatch)
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1594058