Posted on: 05/12/2025
Description :
Location : On-site Gurgaon (Hybrid)
Department : Technology / Engineering
Experience Level : 8+ Years
Employment Type : Full-Time
ABOUT THE ROLE :
We are looking for a highly skilled Lead DevOps Engineer to join our team and help build, scale, and maintain a reliable messaging platform that powers seamless communication for millions of users.
You'll be responsible for designing cloud-native infrastructure, automating deployments, ensuring high availability, and driving operational excellence in a fast-paced environment.
KEY RESPONSIBILITIES :
Infrastructure & Deployment :
- Design, implement, and manage scalable, resilient cloud infrastructure (AWS/GCP/Azure) for messaging workloads.
- Containerize applications (Docker/Kubernetes) and optimize orchestration for performance.
Reliability & Monitoring :
- Ensure high availability and low latency of the messaging platform with proactive monitoring and alerting (Prometheus, Grafana, ELK, Datadog, etc.).
- Troubleshoot production issues, perform root cause analysis, and implement long-term fixes.
- Define and track SLOs/SLAs/SLIs for messaging services.
Automation & Security :
- Automate provisioning, scaling, and failover processes using Infrastructure as Code (Terraform, Ansible, Helm).
- Enforce best practices for system security, secrets management, and compliance.
- Implement disaster recovery, backup strategies, and incident response playbooks.
Collaboration & Culture :
- Work closely with developers, SREs, and QA teams to deliver reliable features.
postmortems.
- Contribute to documentation and knowledge sharing across teams.
Required Skills & Qualifications :
- 8+ years of experience in DevOps/SRE/Cloud Engineering roles.
- Strong experience with Kubernetes, Docker, and CI/CD pipelines.
- Hands-on expertise in cloud platforms (AWS/GCP/Azure) and Infrastructure as Code (Terraform/CloudFormation).
- Solid background in Linux systems, networking, and messaging protocols (e.g., Kafka, RabbitMQ, MQTT, WebSockets, or similar).
- Experience with monitoring, logging, and observability stacks.
- Knowledge of scripting/programming (Python, Bash, Go, etc.).
Preferred Skills :
- Experience with real-time, high-throughput systems (messaging, streaming, or event-driven architectures).
- Familiarity with security best practices in distributed systems.
Why Join Us?
- Opportunity to build and scale a mission-critical messaging platform used globally.
- Work with a passionate, talented team driving innovation in cloud-native communication.
- Competitive salary.
Posted in : DevOps / SRE
Functional Area : DevOps / Cloud
Job Code : 1584836