Senior Engineering Manager - DevOps & Site Reliability

SOCH STREET ADVISORS LLP
Bangalore
9 - 17 Years

Posted on: 01/09/2025

Job Description

DevOps Strategy & Leadership :

- Define and implement the DevOps roadmap, ensuring alignment with business goals.

- Lead a team of SRE engineers, driving automation and operational excellence.

- Foster a DevOps culture across engineering teams by embedding reliability, security, and automation best practices.


Infrastructure & Cloud Management :


- Architect and manage cloud-based infrastructure (AWS, GCP, or Azure).

- Ensure scalability, high availability, and security of production environments.

- Optimize cloud costs & resource utilization through effective monitoring and governance.


CI/CD & Automation :


- Design and implement CI/CD pipelines for automated builds, testing, and deployments.

- Implement Infrastructure as Code (IaC) using Ansible, Terraform, CloudFormation, or Pulumi.

- Enhance deployment strategies with canary releases, blue-green deployments, and rollback mechanisms.


Observability & Performance Optimization :


- Establish monitoring, logging, and alerting frameworks using tools like Prometheus, Grafana, Sensu, ELK, Datadog, or New Relic.

- Improve system reliability, uptime, and incident response through proactive observability and root cause analysis.

- Tune database and application performance for seamless operations.


Security & Compliance :


- Implement DevSecOps best practices, integrating security into CI/CD pipelines.

- Ensure compliance with SOC2, ISO 27001, GDPR, and cloud security standards.

- Enforce IAM, network security, encryption, and vulnerability management policies.


Incident & Disaster Recovery Management :


- Define incident response & disaster recovery strategies to minimize downtime.

- Develop and maintain runbooks & playbooks for operational excellence.

- Implement chaos engineering practices to proactively test system resilience.


Key Skills & Qualifications :


- 10+ years of experience in DevOps, SRE, or Cloud Infrastructure roles.

- Expertise in cloud platforms (AWS, GCP, Azure) with strong knowledge of Kubernetes, Docker, and container orchestration.

- Deep experience in CI/CD tools like Jenkins, GitLab CI/CD, ArgoCD, or Spinnaker.

- Strong people management experience and experience working with cross-functional, non-engineering teams.

- Preferred: experience interacting with clients' IT teams on deployment architecture and security-related topics.

- Strong knowledge of Infrastructure as Code (Terraform, CloudFormation, Pulumi).

- Hands-on experience with observability tools (Prometheus, Grafana, Datadog, ELK).

- Strong understanding of networking, security, and compliance frameworks.

- Experience leading high-performing DevOps teams and collaborating with engineering & security teams.

- Proficiency in Python, Bash, or Go for automation & scripting.

- Strong analytical and troubleshooting skills to improve system uptime & performance.

