
Site Reliability Engineering - Monitoring Tools

Allianz
Pune
10 - 12 Years
4.4 stars | 238+ Reviews

Posted on: 10/08/2025

Job Description

Responsibilities :

- Proven experience in an SRE or infrastructure engineering role with a focus on monitoring, automation, and orchestration.

- Expertise in monitoring tools (Prometheus, ELK, Grafana, etc.) with the ability to optimize monitoring systems and integrate ML/AI models to improve visibility, anomaly detection, and proactive issue resolution.

- Extensive hands-on experience with automation tools such as Terraform, Ansible, and Jenkins, along with proficiency in CI/CD pipelines, to efficiently streamline and optimize network operations and workflows.

- Strong Linux administration skills.

- Good understanding of the networking and security domains, with the ability to critically analyse infrastructure designs and propose innovative improvements to enhance performance, reliability, stability and security.



- Proficiency in scripting languages (Bash, Python, Go).


- Proficiency with containerization and orchestration (Docker, Kubernetes).


- Understanding of cloud platforms such as AWS, Azure, or Google Cloud.

- Familiarity with microservices architecture and distributed systems.


Experience :


- Candidates should have 10+ years of experience, including the following.


- Proven experience in an SRE, DevOps, or infrastructure engineering role with a focus on monitoring, automation, and orchestration.

- Strong knowledge of the networking and security domains, with the ability to critically analyse network designs and propose innovative improvements to enhance performance, reliability, stability and security.

- Expertise in monitoring tools (Prometheus, ELK) with the ability to optimize monitoring systems and integrate ML/AI models to improve visibility, anomaly detection, and proactive issue resolution.

- Extensive hands-on experience with automation tools such as Terraform, Ansible, and Jenkins, along with proficiency in CI/CD pipelines, to efficiently streamline and optimize network operations and workflows.

- Proficiency in scripting languages (Bash, Python, Go).

- Proficiency with containerization and orchestration (Docker, Kubernetes).

- Understanding of cloud platforms such as AWS, Azure, or Google Cloud.

- Familiarity with microservices architecture and distributed systems.

- A fundamental grasp of AI tools will be an added benefit.


Soft Skills :


- Excellent verbal and non-verbal communication skills.

- Should be a team player.


- Good analytical and problem-solving skills.

