Posted on: 28/11/2025
Description:
What You'll Do:
- Lead and Scale Platform Teams: Manage and grow high-performing engineering teams working on core control plane services and observability infrastructure.
- Own the Cloud Control Plane: Architect and operate scalable control plane services, including service orchestration, feature flag systems, configuration propagation, tenancy-aware deployments, and health monitoring.
- Build a World-Class Observability Stack: Own logging, metrics, tracing, alerting, and visualization to support both developer productivity and system reliability.
- Drive Operational Excellence: Establish SLOs, improve MTTR/MTTD, and embed resilience across the platform.
- Partner Across the Org: Collaborate with SRE, Security, Application, and Product Engineering teams to ensure platform services meet evolving business needs.
- Architect for Multi-Tenant SaaS: Enable secure and efficient scaling across tenants in AWS and GCP, with attention to cost, compliance, and observability.
- Contribute Hands-On: Participate in architecture reviews, dive deep into production issues, and mentor engineers on best practices in system design and debugging.
What You Bring:
- 12+ years of engineering experience, with at least 5 years in platform/infrastructure leadership roles.
- Expertise in Kubernetes, service meshes, CI/CD pipelines, and cloud-native architecture.
- Proven experience with control plane engineering, including service discovery, dynamic config, scaling orchestration, and policy enforcement.
- Deep understanding of observability tooling (e.g., Prometheus, Grafana, OpenTelemetry, Datadog, Elastic).
- Familiarity with distributed systems concepts like CAP theorem, consensus, and leader election.
- Experience operating multi-tenant systems in AWS and/or GCP environments.
- Hands-on experience with at least one major programming language (Go, Java, Python).
- Strong stakeholder management and the ability to influence architectural direction across orgs.
Posted in
DevOps / SRE
Functional Area
DevOps / Cloud
Job Code
1582121