We are seeking a skilled Site Reliability Engineer (SRE) / DevOps Engineer to join our infrastructure team. In this role, you will design, build, and maintain scalable infrastructure, CI/CD pipelines, and observability systems to ensure high availability, reliability, and security of our services.

You will work cross-functionally with development, QA, and security teams to automate operations, reduce toil, and enforce best practices in cloud-native environments.

Responsibilities:

- Design, implement, and manage cloud infrastructure (GCP/AWS/Azure) using Infrastructure as Code (Terraform).

- Maintain and improve CI/CD pipelines using tools like CircleCI, GitLab CI, or ArgoCD.

- Ensure high availability and performance of containerized services running on Kubernetes (GKE/EKS/AKS).

- Implement monitoring, logging, and alerting using Prometheus, Grafana, ELK, or similar tools.

- Collaborate with developers to optimize application performance and deployment processes.

- Manage and automate security controls such as IAM, RBAC, network policies, and vulnerability scanning.

Requirements:

- Strong knowledge of Linux.

- Experience with scripting or programming languages such as Python, Bash, or Go.

- Experience with cloud platforms (GCP preferred, AWS or Azure acceptable).

- Proficiency in Kubernetes operations, including Helm, operators, and service meshes.

- Experience with Infrastructure as Code (Terraform).

- Solid experience with CI/CD pipelines (GitLab CI, CircleCI, ArgoCD, or similar).

- Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK, etc.).

- Knowledge of networking concepts (TCP/IP, DNS, Load Balancers, Firewalls).

- Understanding of security best practices (Secrets Management, IAM, Zero Trust).

Preferred Qualifications:

- Experience with advanced networking solutions.

- Familiarity with SRE principles such as SLOs, SLIs, and error budgets.

- Exposure to multi-cluster or hybrid-cloud environments.

- Knowledge of service meshes (Istio).

- Experience participating in incident management and postmortem processes.


Posted by: Saba Rafiq, Sr Staffing Specialist at Pylon Management Consulting

Posted in: DevOps / SRE

Functional Area: DevOps / Cloud

Job Code: 1509235
