Posted on: 30/10/2025
Job Description :
Job Summary :
Key Responsibilities :
Operational Excellence & SRE :
- Drive Site Reliability Engineering (SRE) practices, including SLIs, SLOs, SLAs, error budgets, and automation of operational tasks.
- Manage incident response, root cause analysis, and post-incident reviews to strengthen platform resilience.
- Build and optimize observability and monitoring frameworks (CloudWatch, Grafana, Loki, Tempo, Prometheus).
- Implement self-healing systems and automated recovery where possible.
- Oversee OS patching so that no known vulnerabilities remain outstanding, and maintain compliance with security standards.
Hands-on Cloud & Systems Engineering :
- Provision, manage, and troubleshoot AWS services such as EC2, ECS, EKS, Lambda, ELB, S3, EFS, RDS, VPC, and IAM.
- Hands-on administration of Linux and Windows operating systems, including hardening, patching, and vulnerability remediation.
- Troubleshoot complex issues across infrastructure, applications, networks, and operating systems.
- Deploy and manage container-based workloads (ECS, EKS, Docker).
- Automate operations using Infrastructure-as-Code (CloudFormation, Terraform) and scripting (Python, Ansible, Bash, PowerShell).
- Implement and optimize GitLab CI/CD pipelines for operations-driven automation.
- Support cloud security, IAM, encryption, and compliance standards.
Basic Qualifications :
- 8+ years of experience in cloud operations, engineering, or SRE roles.
- Strong hands-on expertise with AWS services (EC2, ECS, EKS, Lambda, ELB, S3, EFS, VPC, IAM).
- Good experience with Linux and Windows operating systems, including hardening and patching.
- Proficiency with scripting and automation tools (Python, Ansible, Bash, PowerShell).
- Hands-on experience in container-based deployments (ECS, EKS, Docker).
- Proven ability in infrastructure and application troubleshooting.
- Deep knowledge of SRE principles, including monitoring, incident management, and SLIs/SLOs/SLAs.
- Strong expertise in GitLab CI/CD and automation frameworks (CloudFormation, Terraform).
- Working knowledge of cloud security, IAM, and encryption practices.
- Excellent problem-solving, debugging, and communication skills.
Preferred Qualifications :
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1567241