Xebia - DevOps/Site Reliability Engineer

Xebia IT Architects India Pvt Ltd
6 - 8 Years
₹35-40 LPA
Multiple Locations

Posted on: 01/04/2026

Job Description

Description :

Site Reliability Engineer : (We are hiring across Xebia locations : Gurugram, Hyderabad, Bhopal, Chennai, Bengaluru, Jaipur, Pune.)

Ready to take ownership of world-class cloud reliability and engineering excellence? If you love building and optimizing cloud platforms, solving complex distributed-system challenges, and empowering engineering teams through automation and reliability practices, this role gives you the scope and autonomy to make a real technical impact.

What You'll Be Doing :

As a Site Reliability Engineer, you'll help shape and strengthen cloud infrastructure, reliability engineering practices, and operational excellence. You'll work hands-on across AWS, container orchestration, observability platforms, and CI/CD ecosystems to ensure our systems are resilient, secure, and optimized for scale.

Responsibilities :

- Drive architectural and technical decision-making, ensuring infrastructure and platform designs support long-term scalability, reliability, and security.

- Partner with Delivery to plan and prioritize platform and infrastructure work for maximum technical and operational impact.

- Mentor engineers and uplift technical capability, championing strong engineering practices and continuous improvement.

- Shape technical strategy by contributing to architectural roadmaps, standards, and patterns that balance innovation with long-term risk and resilience.

- Embed quality, security, performance, and compliance into all engineering designs, processes, and operational workflows, ensuring reliability at scale.

Interested? Here's what you'll need to be successful :

- 5+ years of experience in a DevOps or SRE role, ideally within AWS-based environments.

- Strong proficiency with AWS CDK and Infrastructure as Code to deploy and optimize cloud infrastructure.

- Hands-on experience with Docker and container orchestration such as Kubernetes (EKS) or Amazon ECS.

- Proven experience building and maintaining CI/CD pipelines using GitLab, Jenkins, or similar tooling.

- Deep knowledge of monitoring, observability, and logging tools such as Prometheus, Grafana, AppDynamics, and OpenSearch.

- Proficiency in Python, TypeScript, or Java for building automation, tooling, and reliability improvements.

- Solid understanding of cloud security, including WAF, patching, vulnerability management, and AWS Shield.

- Working knowledge of message queues and streaming technologies such as RabbitMQ, Kinesis, or Kafka.

- Strong analytical and operational problem-solving skills, with the ability to identify performance constraints, eliminate single points of failure, and scale distributed systems.

- Experience participating in incident response, including root-cause analysis and driving long-term reliability improvements.

- Excellent communication and collaboration skills to work effectively across architecture, delivery, and engineering teams.

Focus on :

- AWS CDK, preferably with TypeScript.

- Monitoring stack with OpenTelemetry.

- API gateway technology.

- OS : Linux skills.

- CI/CD using Jenkins and GitLab.

