
Job Description

Responsibilities:

- Bridging the gaps between the core infra, security, and development teams.


- Owning the end-to-end availability, performance, and capacity of applications and their infrastructure, and creating/maintaining the respective observability with Datadog/New Relic/ECS.


- Providing 24x7 infra and app support while building processes and documenting tribal knowledge.


- Managing application deployments and AWS ECS platforms, and automating and improving development and release processes.


- Creating, managing, and maintaining data stores and data platform infra using IaC.


- Owning and onboarding new applications with the production readiness review process.


- Managing SLOs, error budgets, and alerts, and performing root cause analysis for production errors.


- Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.


- Identifying observability gaps in application and infrastructure, and working with stakeholders to fix them.


- Managing outages by performing detailed RCA with developers and identifying ways to prevent recurrence.


- Automating toil and repetitive work.


Requirements:

- 4 to 6 years of experience in managing large-scale microservices and infrastructure with excellent troubleshooting skills.


- Experience in troubleshooting, managing, and deploying containerized environments using Docker/containers; ECS experience is a must.


- Must be very hands-on in managing and troubleshooting the AWS environment.


- Extensive experience with Linux administration and a good understanding of the various Linux kernel subsystems (memory, storage, network, etc.).


- Good experience in DNS, TCP/IP, UDP, gRPC, routing, and load balancing.


- Expertise in GitOps, Infrastructure as Code tools such as Terraform, and configuration management tools such as Chef, Puppet, SaltStack, and Ansible.


- Experience working with Cloud Infrastructure solutions like AWS.


- Experience in building CI/CD pipelines.


- Experience with multiple data stores is a plus (Redis, Elasticsearch).


- Must be proficient in at least one of the DevOps scripting languages: Python, Ruby, or Go.

