Posted on: 13/07/2025
Role : Senior Site Reliability Engineer (SRE)
Locations : Pune, Bangalore, Hyderabad, Chennai, Ahmedabad, Noida
Experience: 6-7 years
Employment Type: Permanent
Key Responsibilities :
- Design, implement, and maintain highly available, scalable, and secure cloud infrastructure on Google Cloud Platform (GCP).
- Develop and implement CI/CD pipelines using GCP Cloud Build and other cloud-native services to ensure rapid and reliable software deployments.
- Automate operational tasks, infrastructure provisioning, and application deployments using Infrastructure as Code (IaC) tools, primarily Terraform.
- Monitor system performance, troubleshoot complex issues, and participate in on-call rotations to ensure the stability and availability of our services.
- Define, monitor, and achieve Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to guarantee service reliability.
- Implement and enforce security best practices, including role-based access control (RBAC), across our cloud environment.
- Collaborate closely with development teams to ensure reliability is designed into new features and services from the outset.
- Manage and optimize logging and monitoring solutions using tools like Grafana, Prometheus, Splunk, and GCP native logging.
- Maintain and manage codebases and configurations effectively using source control tools like GitHub Enterprise.
- Drive continuous improvement initiatives to enhance system performance, reduce operational overhead, and improve incident response.
- Provide off-hours support as needed to address critical incidents and ensure system stability.
Qualifications :
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
- Minimum of 6 years of IT experience in an engineering role, including at least 2 years in Cloud SRE/Engineer positions.
Required Skills :
- Hands-on experience with Google Cloud Platform (GCP) services, including Cloud Build, Cloud Functions, Cloud Logging, Cloud Monitoring, Google Cloud Storage (GCS), Cloud SQL, and Identity and Access Management (IAM).
- Proficiency in Python.
- Experience with a secondary language such as Golang or Java.
- Proven ability to manage codebases and configurations effectively.
Key Skills :
- Strong knowledge and hands-on experience with Docker.
- Experience in implementing and maintaining robust CI/CD pipelines using GCP and other cloud-native services.
- Proficiency in IaC tools, especially Terraform.
- Solid understanding of security best practices and Role-Based Access Control (RBAC).
- Experience in defining, monitoring, and achieving Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
- Proficiency with source control tools like GitHub Enterprise.
- A strong commitment to continuous improvement and automation of manual tasks.
- Familiarity with monitoring tools such as Grafana, Prometheus, Splunk, and GCP native logging solutions.
Nice to Have Skills :
- Experience in managing Kubernetes (K8s) workloads.
- Experience in secrets management using HashiCorp Vault.
- Experience with tracing tools like Google Cloud Trace or Honeycomb.
Why Join Us ?
We offer a challenging and rewarding environment where you can make a significant impact on our cloud infrastructure and services. You will work with cutting-edge technologies, collaborate with talented engineers, and have opportunities for continuous learning and career growth. If you are ready to take on this exciting challenge, we encourage you to apply!
Posted in: DevOps / SRE
Functional Area: Site Reliability Engineering
Job Code: 1512270