Posted on: 04/09/2025
Organizational :
Overview of Job Role :
Roles & Responsibilities :
Leadership & Strategy :
- Define and implement DevOps strategies aligned with business objectives to enhance scalability, security, and reliability.
- Collaborate with cross-functional teams, including software engineering, security, MLOps, and infrastructure teams, to drive DevOps best practices.
- Establish KPIs and performance metrics for DevOps operations, ensuring optimal system performance, cost efficiency, and high availability.
- Advocate for CPU throttling, auto-scaling, and workload optimization strategies to improve system efficiency and reduce costs.
- Drive MLOps adoption, integrating machine learning workflows into CI/CD pipelines and cloud infrastructure.
- Ensure compliance with ISO 27001 standards, implementing security controls and risk management measures.
Infrastructure & Automation :
- Lead the adoption of Infrastructure as Code (IaC) using Terraform, CloudFormation, and configuration management tools like Ansible or Chef.
- Spearhead automation efforts for infrastructure provisioning, deployment, and monitoring to reduce manual overhead and improve efficiency.
- Ensure high availability and maintain disaster recovery strategies, leveraging multi-region architectures and failover mechanisms.
- Manage Kubernetes (or AWS ECS/EKS) clusters, optimizing container orchestration for large-scale applications.
- Drive cost optimization initiatives, implementing intelligent cloud resource allocation strategies.
CI/CD & Observability :
- Enhance observability and monitoring by implementing tools like CloudWatch, Prometheus, Grafana, ELK Stack, or Datadog.
- Develop robust logging, alerting, and anomaly detection mechanisms to ensure proactive issue resolution.
Security & Compliance (ISO 27001 Implementation) :
- Develop and maintain an Information Security Management System (ISMS) to align with ISO 27001 guidelines.
- Implement secure access controls, encryption, IAM policies, and network security measures to safeguard infrastructure.
- Conduct risk assessments, vulnerability management, and security audits to identify and mitigate threats.
- Ensure security best practices are embedded into all DevOps workflows, following DevSecOps principles.
- Work closely with auditors and compliance teams to maintain compliance with SOC2, GDPR, and other regulatory frameworks.
Required Skills and Qualifications :
managerial or leadership role.
- Proven experience managing AWS cloud infrastructure at scale, including EC2, S3, RDS, Lambda, VPC, IAM, and CloudFormation.
- Expertise in Terraform and Infrastructure as Code (IaC) principles.
- Strong background in CI/CD pipeline automation with tools like Jenkins, GitHub Actions, GitLab CI, or CircleCI.
- Hands-on experience with Docker and Kubernetes (or AWS ECS/EKS) for container orchestration.
- Experience in CPU throttling, auto-scaling, and performance optimization for cloud-based applications.
- Strong knowledge of Linux/Unix systems, shell scripting, and network configurations.
- Proven experience with ISO 27001 implementation, ISMS development, and security risk management.
- Familiarity with MLOps frameworks like Kubeflow, MLflow, or SageMaker, and integrating ML pipelines into DevOps workflows.
- Deep understanding of observability tools such as ELK Stack, Grafana, Prometheus, or Datadog.
- Strong stakeholder management and communication skills, and the ability to collaborate across teams.
- Experience in regulatory compliance, including SOC2, ISO 27001, and GDPR.
Professional Attributes :
- Excellent prioritization skills, the ability to work well under pressure, and the ability to multi-task.
Education Qualification :
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1541005