Lead DevOps Engineer - Cloud Infrastructure

Scaling Theory Technologies Pvt Ltd
8 - 10 Years
Multiple Locations

Posted on: 27/02/2026

Job Description

We are looking for a highly experienced Lead DevOps Engineer to own and drive DevOps strategy, cloud infrastructure, and production reliability across enterprise-scale systems. This is a hands-on, client-facing leadership role, requiring deep expertise in cloud platforms, automation, CI/CD, container orchestration, and cost optimization.


The ideal candidate has automation in their DNA: someone who consistently replaces manual processes with scalable, repeatable, and reliable automation. You will lead DevOps initiatives across multi-cloud and hybrid environments, ensuring high availability, security, performance, and cost efficiency.


Key Responsibilities :


DevOps Leadership & Team Management :


- Lead, mentor, and manage a team of 2-3 DevOps engineers (Infrastructure & Application DevOps).


- Enforce DevOps best practices, automation-first culture, and operational excellence.


- Conduct code reviews, enable knowledge sharing, and improve DevOps workflows.


Cloud & Infrastructure Ownership :


- Own end-to-end infrastructure across Microsoft Azure (primary), Amazon Web Services, Google Cloud Platform, and hybrid/on-prem environments.


- Architect and manage scalable, resilient, and highly available systems.


- Apply deep expertise in Azure services, including AKS, Azure DevOps, Azure Pipelines, networking, storage, IAM, and monitoring.


Containerization & CI/CD :


- Design and operate containerized workloads using Docker and Kubernetes.


- Build, maintain, and optimize CI/CD pipelines using Azure Pipelines, GitHub Actions, GitLab CI, Jenkins, or ArgoCD.


- Implement Infrastructure as Code (IaC) using Terraform with modular, reusable components.


Production Operations & Reliability :


- Own uptime, reliability, performance, and scalability of production systems.


- Implement observability and monitoring using tools such as Prometheus, Grafana, ELK, Datadog, or New Relic.


- Lead incident response, root cause analysis (RCA), and preventive action planning.


- Perform capacity planning and infrastructure right-sizing.


Cloud Cost Optimization (FinOps) :


- Continuously analyze and optimize cloud spend across Azure, AWS, and GCP.


- Prepare and present weekly, monthly, and quarterly cost reports to leadership and clients.


- Create forecasts, budgets, and cost projections.


- Apply FinOps principles and recommend cost-efficient architectural improvements.


Security, Governance & Compliance :


- Implement cloud security best practices and automated policy enforcement.


- Ensure least-privilege access, secure CI/CD pipelines, and compliance readiness.


- Collaborate with security teams on audits, vulnerability management, and hardening.


Client-Facing Responsibilities :


- Act as the primary technical point of contact for clients.


- Present architecture designs, production metrics, performance updates, and cost insights.


- Translate complex technical concepts into clear, business-friendly communication.


Required Skills & Experience :


Technical Must-Haves :


- 8 to 10 years of DevOps experience, including 2+ years in a lead role.


- Strong hands-on expertise in :


a. Azure, AKS, Azure DevOps, Azure Pipelines


b. Docker & Kubernetes


c. Terraform (Infrastructure as Code)


- CI/CD pipeline design and automation


- Working knowledge of AWS and GCP.


- Experience managing on-premise or hybrid infrastructure.


- Strong understanding of networking (VPC/VNet, routing, load balancers), monitoring, HA, DR, and scaling strategies.


Automation Mindset :


- Proven automation-first approach to infrastructure and operations.


- Experience eliminating manual processes using scripts, pipelines, and IaC.


- Proficiency in scripting languages such as Bash, Python, or PowerShell.


Operational Excellence :


- Experience managing high-traffic, highly available production systems.


- Strong troubleshooting, incident management, and performance optimization skills.


- Expertise in capacity planning and cost optimization.


Soft Skills :


- Excellent written and verbal communication in English.


- Confident client-facing presence.


- Strong leadership, ownership, and mentoring abilities.


Good-to-Have Skills :


- GitOps (ArgoCD, FluxCD)


- Service Mesh (Istio, Linkerd)


- Serverless architectures (Azure Functions, AWS Lambda)


- FinOps or cloud cost governance experience


- Experience supporting AI, Big Data, or advanced workloads
