Staff Platform Engineer - Site Reliability

First American (India) Pvt Ltd

11 - 15 Years

Bangalore

Site Reliability, Python, IaC, Terraform, CI/CD Pipeline, Kubernetes, Docker, Observability Services, Cloud Infrastructure, Incident Management, AWS, Azure

Posted on: 02/01/2026

Job Description

About the Role:

We are looking for an experienced Staff Platform Engineer with a strong Site Reliability Engineering (SRE) mindset to join our Platform Engineering team. This role is critical in building resilient, scalable, and secure platforms that empower development teams to deliver high-quality software efficiently.

As a Staff Engineer, you will lead initiatives to improve reliability, observability, and operational excellence across our platforms.

You will design and implement solutions that automate infrastructure, optimize performance, and ensure high availability. This position requires a balance of deep technical expertise, leadership skills, and a passion for reliability engineering.

Key Responsibilities:

- Architect and implement highly available, scalable, and secure cloud platforms (AWS, Azure, GCP).

- Drive SRE practices: implement SLIs, SLOs, and error budgets to improve reliability and performance.

- Enhance observability: build advanced monitoring, logging, and alerting systems for proactive issue detection.

- Automate everything: infrastructure provisioning, deployments, and operational tasks using IaC and scripting.

- Lead incident management and postmortems, ensuring root cause analysis and continuous improvement.

- Collaborate with development and operations teams to embed reliability into the software lifecycle.

- Mentor engineers, fostering a culture of operational excellence and innovation.

- Contribute to the technical roadmap, aligning platform capabilities with organizational goals.

Key Requirements:

- 12+ years of experience in Platform Engineering, SRE, or DevOps roles.

- Strong application development background (5+ years in .NET and Java).

- Proven experience as a technical lead, driving design and architecture decisions.

- Expertise in AWS and Azure infrastructure and services.

- Advanced scripting skills (Python preferred).

- Deep knowledge of IaC tools (Terraform, CloudFormation).

- Experience designing and implementing CI/CD pipelines (GitHub Actions, Azure DevOps, Jenkins, AWS CodePipeline, or similar).

- Experience with containerization and orchestration (Docker, Kubernetes).

- Proficiency with version control systems and platforms (Git, Bitbucket, TFS).

- Experience with configuration management tools (Ansible, Chef, Puppet).

- Hands-on experience with code reviews, design reviews, and technical governance.

Nice to Have:

- AWS or Azure certifications.

- Experience with serverless architectures and automation.

- Familiarity with GitOps workflows and progressive delivery strategies.