Posted on: 05/09/2025
We are looking for a skilled Site Reliability Engineer (SRE) with a strong DevOps background and deep expertise in Google Cloud Platform (GCP).
The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of production systems while implementing modern DevOps practices.
Responsibilities:
- Design, build, and maintain scalable, reliable infrastructure on GCP.
- Develop and maintain CI/CD pipelines using tools such as Jenkins and GitLab CI/CD.
- Implement observability, including monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK, Dynatrace).
- Automate infrastructure using Infrastructure as Code (IaC) tools such as Terraform and Ansible.
- Manage containerized applications using Docker and orchestration tools like Kubernetes.
- Write scripts and automation in languages such as Python, Go, and Bash.
- Collaborate with engineering teams to define SLOs, SLIs, and error budgets.
- Participate in incident response, root cause analysis, and system optimization.
Requirements:
- Strong knowledge of Linux/Unix fundamentals.
- Proficient in at least one programming/scripting language (Python, Go, Bash, Java, or JavaScript).
- Experience with Version Control Systems (e.g., Git).
- Hands-on experience with CI/CD pipelines.
- Deep understanding of cloud environments, specifically GCP.
- Proficiency with IaC tools (Terraform, Ansible, Chef, Puppet).
- Knowledge of containerization & orchestration (Docker, Kubernetes).
- Familiarity with monitoring/logging tools: Dynatrace, ELK, OpenSearch, Log Explorer, Prometheus, and Grafana.
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1541673