We are seeking a seasoned Site Reliability Engineer (SRE) to join our growing team.
This role is critical to ensuring the reliability, scalability, and performance of our cloud infrastructure on AWS. You will leverage your expertise in automation, infrastructure management, and cost optimization to build and maintain resilient systems that support our business objectives. The role requires a proactive, results-oriented individual with a passion for robust, scalable systems.

Responsibilities :

- Design, deploy, and manage highly available and scalable infrastructure on AWS.

- Automate infrastructure provisioning and configuration using tools like Terraform and Ansible.

- Develop and implement monitoring and alerting systems to proactively identify and troubleshoot incidents.

- Optimize infrastructure costs on AWS through resource management and utilization analysis.

- Collaborate with development teams to implement DevOps practices and ensure smooth deployments.

- Participate in on-call rotations and diligently respond to incidents to minimize downtime.

- Continuously improve infrastructure reliability and performance through automation and best practices.

- Stay up-to-date with the latest trends and technologies in cloud computing and SRE principles.

Qualifications :

- 4+ years of experience in Site Reliability Engineering or a related field (DevOps)

- Proven expertise in deploying and managing infrastructure on AWS (EC2, S3, VPC, etc.)

- Experience with Linux is a must. Prior experience as a Linux administrator is a plus.

- Strong understanding of networking fundamentals is a must.

- Strong knowledge of infrastructure automation tools like Terraform and Ansible

- Experience with DevOps methodologies and CI/CD pipelines

- A keen understanding of cost optimization principles in AWS

- Excellent problem-solving and analytical skills

- Ability to work independently and as part of a cross-functional team

- Diligent and proactive approach to incident response

- Willingness to participate in on-call rotations

Good to have :

- Experience with compliance frameworks (SOC 2, HIPAA, etc.)

- Experience with container orchestration tools (Kubernetes)

Posted By

Nitin Singhal

Director - Recruitments at Zealant Consulting Group

Posted in

DevOps / SRE

Functional Area

Site Reliability Engineering

Job Code

1582513
