Job Description

Role Purpose :


We are seeking an AWS Cloud Infra & L2 DevOps Engineer to support our project by monitoring, maintaining, and ensuring the availability of applications hosted on OpenShift. The role involves proactive monitoring of nodes and pods, handling alerts, and taking preventive and corrective actions to ensure high availability and operational stability in a 24/7 environment.


Key Responsibilities :


- Design, deploy, and manage Red Hat OpenShift on AWS (ROSA) clusters.


- Perform cluster lifecycle management including installation, upgrades, patching, and scaling.


- Integrate OpenShift with AWS services such as EC2, VPC, ELB/ALB, IAM, S3, RDS, and Route 53.


- Implement and manage CI/CD pipelines using Jenkins, GitLab CI, GitHub Actions, Argo CD, or Tekton.


- Automate infrastructure provisioning using Terraform and/or CloudFormation.


- Manage container images, registries, and image security scanning.


- Implement security best practices including IAM roles, OpenShift RBAC, network policies, and secrets management.


- Monitor and troubleshoot clusters and applications using Prometheus, Grafana, CloudWatch, and EFK/ELK.


- Support application teams with microservices deployments, Helm charts, and Operators.


- Handle incident management, root cause analysis, and production support.


- Ensure high availability, disaster recovery, and backup strategies on AWS.


- Ensure health and availability of nodes, pods, and clusters.


- Proactively monitor system alerts and take precautionary actions to prevent incidents.


- Support 24/7 operations, including shift-based work.


- Ensure compliance with security, operational, and organizational standards.


Required Skills & Experience :


- Strong knowledge of AWS cloud services: EC2, VPC, IAM, ELB/ALB, Auto Scaling, S3, and RDS/DynamoDB (basic).


- Solid understanding of Kubernetes internals and container orchestration.


- Experience with CI/CD tools : Jenkins, GitLab CI, Argo CD, Tekton.


- Proficiency in at least one scripting language (Shell, Bash, Python, etc.).


- Hands-on experience with application and infrastructure monitoring.


- Working knowledge of ITIL processes (Incident, Problem, Change Management).


- Understanding of networking concepts : VPC, subnets, security groups, DNS, SSL/TLS.


Tools & Technologies :


- Ansible.


- Dynatrace.


- Prometheus.


- Splunk.


- Git.


- ServiceNow.
