Posted on: 08/07/2025
We are seeking a skilled Site Reliability Engineer (SRE) / DevOps Engineer to join our infrastructure team. In this role, you will design, build, and maintain scalable infrastructure, CI/CD pipelines, and observability systems to ensure high availability, reliability, and security of our services.
You will work cross-functionally with development, QA, and security teams to automate operations, reduce toil, and enforce best practices in cloud-native environments.
Responsibilities:
- Design, implement, and manage cloud infrastructure (GCP/AWS/Azure) using Infrastructure as Code (Terraform).
- Maintain and improve CI/CD pipelines using tools like CircleCI, GitLab CI, or ArgoCD.
- Ensure high availability and performance of services running on Kubernetes (GKE/EKS/AKS).
- Implement monitoring, logging, and alerting using Prometheus, Grafana, ELK, or similar tools.
- Collaborate with developers to optimize application performance and deployment processes.
- Manage and automate security controls such as IAM, RBAC, network policies, and vulnerability scanning.
Requirements:
- Strong knowledge of Linux.
- Experience with scripting languages such as Python, Bash, or Go.
- Experience with cloud platforms (GCP preferred, AWS or Azure acceptable).
- Proficient in Kubernetes operations, including Helm, operators, and service meshes.
- Experience with Infrastructure as Code (Terraform).
- Solid experience with CI/CD pipelines (GitLab CI, CircleCI, ArgoCD, or similar).
- Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK, etc.).
- Knowledge of networking concepts (TCP/IP, DNS, load balancers, firewalls).
- Understanding of security best practices (secrets management, IAM, zero trust).
Preferred Qualifications:
- Experience with advanced networking solutions.
- Familiarity with SRE principles such as SLOs, SLIs, and error budgets.
- Exposure to multi-cluster or hybrid-cloud environments.
- Knowledge of service meshes (Istio).
- Experience participating in incident management and postmortem processes.
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1509235