Job Description

About the job :

We are seeking a Lead DevOps Engineer to design, implement, and operate scalable cloud infrastructure and delivery platforms across Google Cloud and Azure environments.

The role requires hands-on expertise in modern CI/CD systems, Kubernetes platforms, infrastructure as code, and distributed data systems, along with the ability to guide teams, establish standards, and drive operational excellence.

Cloud & Infrastructure :

- Architect and manage multi-cloud infrastructure across Google Cloud Platform and Microsoft Azure

- Design secure, highly available, and cost-optimized networking and compute architectures

- Implement infrastructure automation using Terraform and OpenTofu

- Establish platform standards, reusable modules, and governance practices

- Drive reliability practices including monitoring, alerting, capacity planning, and incident response

CI/CD & Platform Engineering :

- Build and maintain scalable CI/CD pipelines using Harness, Azure DevOps, Google Cloud Build, and Devtron

- Implement trunk-based and release-based deployment strategies

- Enable progressive delivery, rollback strategies, and release observability

- Maintain artifact lifecycle management and environment promotion workflows

Containers & Orchestration :

- Design and operate Kubernetes platforms (GKE, AKS, OpenShift, MicroK8s)

- Build reusable deployment templates and GitOps workflows

- Implement cluster security, policy enforcement, and workload isolation

- Support application onboarding and developer platform experience

Systems Administration :

- Manage Linux systems (Ubuntu, RHEL) including patching, hardening, and performance tuning

- Automate provisioning and configuration management

- Maintain operational documentation and runbooks

Data & Streaming Platforms :

- Deploy and operate PostgreSQL databases in production environments

- Manage distributed data systems including StarRocks and Apache Flink

- Implement backup, recovery, and high availability strategies

Observability :

- Loki, Grafana, Mimir, Thanos

Leadership :

- Lead DevOps initiatives and mentor engineers

- Define operational best practices and SRE standards

- Collaborate with development, security, and architecture teams

- Drive continuous improvement in reliability, scalability, and developer productivity

Required Skills & Experience :

- Strong hands-on experience in Google Cloud Platform and Microsoft Azure

- Deep expertise in Kubernetes (production operations and troubleshooting)

- Proven experience with CI/CD tools: Harness, Azure DevOps, Google Cloud Build, or Devtron

- Infrastructure as Code using Terraform or OpenTofu

- Linux system administration (Ubuntu and RHEL)

- Experience managing databases and streaming/data platforms (PostgreSQL, Flink, StarRocks)

- Strong scripting skills (Bash, Python, or equivalent)

- Experience designing high-availability and fault-tolerant systems

Preferred Qualifications :

- Experience implementing GitOps workflows

- Observability stack implementation (metrics, logs, tracing)

- Security and compliance automation

- Multi-cluster and multi-region architecture experience

- Cost optimization and capacity planning expertise

Leadership Expectations :

- Own platform reliability and operational maturity

- Mentor team members and review architecture decisions

- Drive automation over manual processes

- Champion operational excellence and engineering best practices

Experience :

- 10+ years in DevOps / SRE / Platform Engineering

- Prior experience in a lead or senior ownership role

