Posted on: 04/08/2025
Key Responsibilities :
- AWS Cloud Operations : Apply extensive hands-on AWS experience to manage and maintain our cloud infrastructure. You should have a good understanding of key services like CloudFormation, KMS, S3, EC2, CloudWatch, and IAM.
- Infrastructure as Code (IaC) : Work with Terraform to manage and provision our infrastructure, and be able to debug and modify existing code as needed.
- CI/CD & DevOps : Fully understand and have hands-on experience with SCM (Source Code Management) and CI/CD principles. You will be responsible for managing workflows and pipelines.
- Monitoring & Logging : Use your hands-on experience with Splunk and Dynatrace/New Relic to monitor system health and troubleshoot issues. You should be able to build custom indexes and dashboards to provide clear insights.
- Troubleshooting & Root Cause Analysis : Analyze logs and system behavior to quickly identify and troubleshoot issues, ensuring minimal downtime.
- Cost Optimization : Understand AWS costing models and work to optimize our architecture to improve efficiency and reduce costs.
- Secrets & Artifact Management : Use AWS secrets management services and be able to understand other secrets management systems. You should also be able to work with Docker and manage image artifacts in JFrog Artifactory.
- Scripting & Automation : Be able to debug any code and make modifications where needed, with a focus on Python.
- Documentation : Maintain and improve SRE documentation to assist both the team and end-users, ensuring all processes are clearly defined.
Required Skills & Qualifications :
- 5 to 9 years of experience in a similar SRE or Operations role.
- Extensive hands-on experience with AWS cloud platforms.
- Good understanding of core AWS services : CloudFormation, KMS, S3, EC2, CloudWatch, and IAM.
- Experience with secrets management services from AWS.
- Proficiency in Python and Terraform for automation and IaC.
- Strong understanding of SCM and CI/CD principles.
- Hands-on experience with logging and monitoring tools like Splunk and Dynatrace/New Relic.
- Familiarity with Docker and JFrog.
- Good understanding of GitHub and GitHub Actions workflows.
- Excellent problem-solving, analytical, and documentation skills.
- Bachelor's degree in Computer Science or a related field (Master's preferred).
Posted By
Deepthi Anupula
Talent Acquisition Specialist at Apps Associates (I) Pvt. Ltd
Last Active: 24 Nov 2025
Posted in
DevOps / SRE
Functional Area
Site Reliability Engineering
Job Code
1524713