Senior Site Reliability Engineer - IAC Terraform

Options Executive Search Pvt Ltd
Hyderabad
8 - 12 Years

Posted on: 27/08/2025

Job Description

Job Title : SRE Lead Engineer.

Location : Hyderabad, India.

We are seeking a DevOps / SRE Lead Engineer to architect and scale our client's multi-tenant SaaS platform with AI/ML at the core.

Our client, a fast-growing AI-powered SaaS company in the FinTech space, is looking for a Site Reliability Engineering (SRE) Lead Engineer to join their dynamic team.

This is an opportunity to design and operate large-scale SaaS systems that integrate cutting-edge AI/ML capabilities.

About the Role :

As the SRE Lead Engineer, you will be responsible for architecting, building, and maintaining infrastructure that powers a multi-tenant SaaS platform.

You'll drive reliability, scalability, and security, while supporting AI/ML pipelines in production.

This is a hands-on role with significant ownership, requiring both technical depth and leadership in site reliability practices.

Key Responsibilities :

- Architect, design, and deploy end-to-end infrastructure for large-scale, microservices-based SaaS platforms.

- Ensure system reliability, scalability, and security for AI/ML model integrations and data pipelines.

- Automate environment provisioning and management using Terraform in AWS (EKS-focused).

- Implement full-stack observability across applications, networks, and operating systems.

- Lead incident management and participate in a 24/7 on-call rotation.

- Optimize SaaS reliability while enabling REST APIs, SSO integrations (Okta/Auth0), and cloud data services (RDS/MySQL, Elasticsearch).

- Define and maintain backup and disaster recovery for critical workloads.

Required Skills & Experience :

- 8+ years in SRE/DevOps roles, managing enterprise SaaS applications in production.

- Minimum 1 year of experience with AI/ML infrastructure or model-serving environments.

- Strong expertise in AWS cloud, particularly EKS, container orchestration, and Kubernetes.

- Hands-on experience with Infrastructure as Code (Terraform), Docker, and scripting (Python, Bash).

- Solid Linux OS and networking fundamentals.

- Experience in monitoring and observability with ELK, CloudWatch, or similar tools.

- Strong track record with microservices, REST APIs, SSO, and cloud databases.

Nice-to-Have Skills :

- Experience with MLOps and AI/ML pipeline observability.

- Cost optimization and security hardening in multi-tenant SaaS.

- Prior exposure to FinTech or enterprise finance solutions.

Qualifications :

- Bachelor's degree in Computer Science, Engineering, or related discipline.

- AWS Certified Solutions Architect (strongly preferred).

- Experience in early-stage or high-growth startups is an advantage.

Why Join?

- Be at the forefront of AI/ML-powered SaaS innovation in FinTech.

- Work with a high-energy, entrepreneurial team building next-gen infrastructure.

- Take ownership of mission-critical reliability challenges.

- Grow your career in an environment that values impact, adaptability, and innovation.

