hirist

AIOps/Automation Engineer

KVAS Technologies
Multiple Locations
6 - 8 Years

Posted on: 25/09/2025

Job Description

Job Title: AI Ops / Automation Engineer

Experience: 7+ Years

Work Mode: Work from Office / Hybrid

Job Overview:

We are looking for an experienced AI Ops / Automation Engineer with strong expertise in IT operations automation, monitoring, and AI-driven incident management. The ideal candidate will have hands-on experience with AIOps platforms, DevOps/Cloud automation, and advanced scripting to streamline operations, improve system reliability, and reduce manual intervention.

Key Responsibilities:

- Design, develop, and implement AIOps and automation solutions to enhance IT operations efficiency.

- Integrate monitoring tools, observability platforms, and AIOps solutions for predictive incident detection and resolution.

- Automate repetitive operational tasks using Python, Ansible, Terraform, or equivalent tools.

- Implement event correlation, noise reduction, and root-cause analysis via AIOps platforms.

- Collaborate with DevOps, Cloud, and Infrastructure teams to define and implement automation strategies.

- Ensure scalability, resilience, and performance of production systems through automation.

- Build and maintain CI/CD pipelines with automated monitoring and remediation workflows.

- Drive adoption of ML/AI-based solutions to improve operational visibility and reduce MTTR (Mean Time to Resolution).

- Provide mentorship and technical guidance to junior engineers.

Must-Have Skills:

- 7+ years of experience in IT Operations / Automation / DevOps roles.

- Hands-on experience with AIOps platforms (Moogsoft, Splunk ITSI, Dynatrace, New Relic, BigPanda, or equivalent).

- Strong expertise in scripting/programming (Python, Shell, PowerShell).

- Experience with Infrastructure as Code (IaC) and configuration management tools such as Ansible, Terraform, or Puppet.

- Good knowledge of CI/CD pipelines, Docker, and container orchestration with Kubernetes.

- Proficiency in monitoring and observability tools: Prometheus, Grafana, ELK, Splunk, AppDynamics, Datadog.

- Strong understanding of incident management, root cause analysis, and ITIL practices.

- Cloud expertise: automation and monitoring on AWS, Azure, or GCP.

- Excellent problem-solving skills, analytical mindset, and ability to work in fast-paced environments.

Good to Have:

- Exposure to Machine Learning concepts applied to operations (log analysis, anomaly detection, predictive analytics).

- Knowledge of service mesh (Istio, Linkerd) and cloud-native monitoring frameworks.

- Experience with security automation (SOAR, SIEM integration).

- Prior experience in leading automation initiatives or small teams.

