Posted on: 17/10/2025
Description:
- Containerize applications using Docker, and manage orchestration with Kubernetes.
- Collaborate with developers and QA teams to integrate CI/CD pipelines and automate deployment processes.
- Ensure system reliability, uptime, and performance using monitoring tools such as Grafana and Dynatrace.
- Troubleshoot system failures, conduct root cause analysis, and provide long-term solutions to prevent recurrence.
- Script and automate operational tasks using Python or Java to improve system efficiency.
- Maintain documentation of system architecture, procedures, and configurations.
- Participate in incident response and on-call support rotation if required.
Required Skills & Qualifications:
- Minimum 5 years of hands-on experience in a DevOps/SRE role.
- Strong expertise in AWS or Google Cloud Platform (GCP).
- Deep understanding and practical experience with Docker and Kubernetes in production environments.
- Proficient in Java or Python for scripting, automation, and integrations.
- Experience with monitoring tools such as Grafana, Dynatrace, and Prometheus.
- Strong problem-solving skills and ability to work in a fast-paced environment.
- Excellent communication and documentation skills.
Must-Have Skills:
- AWS, DevOps, Prometheus, Grafana, Splunk, Python Scripting.
- Experience configuring and setting up monitoring dashboards with tools such as Splunk and Grafana.
Preferred Attributes:
- Prior experience in large-scale enterprise systems.
- Ability to work independently and take ownership of DevOps processes.
- Exposure to Agile/Scrum methodologies.
Posted By
Sarakshi Pandey
IT Recruiter at VAK Consulting LLC
Posted in
DevOps / SRE
Functional Area
Site Reliability Engineering
Job Code
1561606