Posted on: 02/11/2025
Description :
- Perform root cause analysis on incidents, prepare detailed reports for stakeholders, and design preventive measures to avoid recurrence.
- Develop and maintain tools that extend the functionality of monitoring and observability platforms.
- Continuously identify and address performance bottlenecks and opportunities for process improvement.
- Collaborate closely with developers, DevOps, and infrastructure teams to deliver scalable and resilient systems.
- Automate operational tasks to enhance efficiency and reduce manual intervention.
- Investigate and analyze large data sets using tools like Elasticsearch and Kibana.
- Work with log collection systems such as Logstash or Filebeat to enhance observability and troubleshooting.
- Communicate clearly and effectively during incident management and technical discussions.
- Contribute to knowledge sharing, documentation, and process optimization within the team.
Required Skills & Experience :
- Hands-on experience managing enterprise web applications hosted on IIS.
- Good understanding of Git for version control.
- Proficiency with SQL and Redis.
- Working knowledge of infrastructure and virtualized environments.
- Familiarity with DevOps practices, including automation, monitoring, and CI/CD principles.
- Excellent communication skills in English (oral and written) with the ability to collaborate across teams.
- Understanding of Scrum methodology and exposure to the Product Owner role.
- Willingness to work with .NET environments.
- Good knowledge of Linux OS, including Bash scripting and command-line utilities (a plus).
- Experience with Elasticsearch and Kibana for data analysis (a plus).
- Familiarity with log collection systems like Logstash or Filebeat (a plus).
What's in It for You :
- Tackle a wide variety of technical challenges daily across multiple customers and technologies.
- Gain significant responsibility and impact in a fast-scaling organization.
- Work in a collaborative environment that encourages creativity, continuous learning, and pragmatic problem-solving.
- Earn the respect and appreciation of your peers as you help build reliable, powerful, and future-proof systems.
Posted in: DevOps / SRE
Functional Area: Site Reliability Engineering
Job Code: 1568727