Posted on: 27/10/2025
Description :
AWS Cloud Infrastructure :
- Design, deploy, and manage scalable, secure, and highly available systems on AWS.
- Optimize cloud costs, enforce tagging, and implement security best practices (IAM, VPC, GuardDuty, etc.).
- Automate infrastructure provisioning using Terraform or AWS CDK.
- Ensure backup, disaster recovery, and high availability (HA) strategies are in place.
Kubernetes (EKS preferred) :
- Manage and scale Kubernetes clusters (preferably Amazon EKS).
- Implement CI/CD pipelines with GitOps (e.g., ArgoCD or Flux) or traditional tools (e.g., Jenkins, GitLab).
- Enforce RBAC policies, namespace isolation, and pod security policies.
- Monitor cluster health and optimize pod scheduling, autoscaling, and resource limits/requests.
Monitoring and Observability (Datadog) :
- Build and maintain Datadog dashboards for real-time visibility across systems and services.
- Set up alerting policies, SLOs, SLIs, and incident response workflows.
- Integrate Datadog with AWS, Kubernetes, and applications for full-stack observability.
- Conduct post-incident reviews using Datadog analytics to reduce MTTR.
Automation and DevOps :
- Automate manual processes (e.g., server setup, patching, scaling) using Python, Bash, or Ansible.
- Maintain and improve CI/CD pipelines (Jenkins) for faster and more reliable deployments.
- Drive Infrastructure-as-Code (IaC) practices using Terraform to manage cloud resources.
- Promote GitOps and version-controlled deployments.
Linux Systems Administration :
- Administer Linux servers (Ubuntu, RHEL, Amazon Linux) for stability and performance.
- Harden OS security, configure SELinux and firewalls, and ensure timely patching.
- Troubleshoot system-level issues: disk, memory, network, and processes.
- Optimize system performance using tools like top, htop, iotop, netstat, etc.
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1565738