Posted on: 27/11/2025
Description :
Key Responsibilities :
- Manage, deploy, and optimize applications using Kubernetes in production environments.
- Work extensively with AWS services including IAM, EC2, EKS, S3, CloudWatch, and related cloud infrastructure components.
- Develop automation scripts and tools using Shell or Python to improve reliability, reduce toil, and enable self-healing mechanisms.
- Troubleshoot complex issues related to applications, networking, system performance, and low-latency environments.
- Perform Linux system debugging, optimization, and performance tuning using advanced tools and techniques.
- Create robust monitoring and alerting frameworks for high-performance systems.
- Collaborate with cross-functional teams to ensure smooth deployment, scalability, and reliability of services.
- Implement and follow SRE principles including monitoring, alerting, incident management, error budgets, fault analysis, capacity planning, and automation.
- Participate in on-call rotations to ensure 24/7 availability and rapid response to incidents.
Required Skills & Qualifications :
- 5+ years of hands-on experience in SRE, DevOps, or Infrastructure Engineering roles.
- Strong experience managing large-scale infrastructure across medium to large networks.
- Deep understanding of Kubernetes, AWS cloud infrastructure, and Linux systems.
- Proficiency in scripting/programming using Shell or Python.
- Strong analytical, troubleshooting, and problem-solving abilities.
- Excellent collaboration and interpersonal skills.
- Ability to thrive in a fast-paced, evolving technology environment.
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Willingness to upskill and stay current with emerging technologies.
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1581845