Posted on: 10/07/2025
Responsibilities:
- Architect and manage highly available Kubernetes clusters using AWS EKS with optimized auto-scaling and secure networking.
- Leverage Docker for containerization and implement secure CI/CD workflows using tools like Jenkins or Devtron.
- Define and manage infrastructure using Terraform, maintaining reusable and version-controlled modules.
- Build, monitor, and optimize core AWS services: ALB/NLB, Security Groups, IAM (Roles & Policies), VPC, Route53, KMS, SSM, Patch Manager, and more.
- Ensure infrastructure is compliant with high availability, cost-efficiency, and security standards across environments.
- Enforce CIS Benchmark standards for Kubernetes, EC2, IAM, and other AWS resources.
- Implement container image hardening practices (e.g., scanning for CVEs, minimal base images, signed images, SBOM integration).
- Configure cloud-native security controls (e.g., KMS, IAM boundaries, GuardDuty, SecurityHub) to enforce access control and encryption policies.
- Collaborate with InfoSec to manage vulnerability scans, incident playbooks, VAPT responses, and compliance posture reporting.
- Drive full-stack observability using DataDog, CloudWatch, Grafana, and OpenTelemetry (OTEL)-based open-source pipelines.
- Build monitoring solutions to track infrastructure health, performance, latency, and alerts with actionable dashboards.
- Define and maintain SLIs, SLOs, error budgets, and automated alert escalation processes.
- Provision and manage highly available data stores: MongoDB, MySQL/Aurora, Redis, DocumentDB, ClickHouse.
- Design and implement backup policies, patching processes, disaster recovery (RTO/RPO), and scaling strategies for all critical data systems.
- Own end-to-end CI/CD workflows using Jenkins, Devtron, or similar platforms.
- Automate patching, configuration drift detection, resource tagging, and access lifecycle workflows.
- Create internal tooling and scripts to manage infrastructure, cost audits, access reviews, and deployment flows.
- Lead and mentor a team of DevOps engineers, fostering ownership, growth, and collaboration.
- Drive technical architecture decisions, enforce best practices, and streamline operational playbooks.
- Conduct RCA for outages/incidents and lead cross-functional response improvements.
Requirements:
- 6+ years of experience in DevOps, SRE, or Infrastructure Engineering roles, with 2+ years in a leadership or staff capacity.
- Strong hands-on expertise in AWS, Terraform, Kubernetes (EKS), Docker, and Linux internals.
- Deep understanding of cloud security, IAM practices, encryption, CIS benchmarks, and image hardening strategies.
- Experience with observability tools like DataDog, Grafana, CloudWatch, and OTEL-based open-source alternatives.
- Proven experience managing relational and NoSQL databases: MongoDB, MySQL/Aurora, Redis, DocumentDB, ClickHouse.
- Experience implementing and managing infrastructure under compliance standards like SOC2, ISO27001, or similar.
- Certifications: AWS DevOps Professional, AWS Solutions Architect, HashiCorp Terraform Associate.
- Experience with automated compliance audits, security event correlation, and incident management tools.
- Familiarity with tools like Cloudflare, Devtron, Prometheus, and FluentBit.
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1510837