Posted on: 03/04/2026
Senior Cloud Infrastructure Engineer - Kubernetes
Experience : 5 - 9 years | 3+ years with EKS/Kubernetes in production
Location : Office Coimbatore/Bengaluru
About Aivar Innovations :
Aivar is an AI-first technology partner where cutting-edge technology meets industry expertise to supercharge your projects. Our AI-augmented teams accelerate development, reduce time-to-market, and deliver exceptional code quality. We bring together the best minds in tech to craft scalable, repeatable solutions that drive real momentum for your business.
Technical Focus :
Foundational hire for the AI Ops stack. Own the entire EKS platform: hardened cluster configurations, Terraform modules, Karpenter GPU-aware autoscaling, multi-tenancy (RBAC, namespace isolation, network policies), multi-region DR, and cost optimization. Build infrastructure that runs Llama 70B at sub-second latency on multi-GPU instances.
Functional Expectations :
- Design hardened EKS clusters : private endpoints, IMDSv2, Pod Security Admission, image scanning, audit logging
- UltraCluster scale : experience building HPC and large-scale clusters suited to AI Ops workloads ranging from SLMs to LLMs
- Build Terraform modules for the complete Kubogent stack - VPC, EKS, GPU/CPU node groups, IAM, networking, storage
- Configure Karpenter for GPU-aware autoscaling across instance families (G6e, P4d, P5, Inferentia)
- Implement multi-tenancy - namespace isolation, resource quotas, RBAC, network policies, fair-share scheduling
- Build multi-region DR with automated failover, cross-region replication, and failover testing
- Optimize cloud spend - Capacity Blocks, Spot instances, reserved pricing, right-sizing, KubeCost integration
- Design robust network architecture - VPC CNI, private subnets, security groups, Transit Gateway, private endpoints
Must-Have Technical Skills :
- AWS infrastructure : deep VPC, IAM, networking, multi-account (5+ years)
- Kubernetes/EKS - production clusters, networking (CNI), storage, RBAC (3+ years)
- Terraform expertise - large module codebases, remote state, workspaces, CI/CD integration
- Karpenter or Cluster Autoscaler in production
- GPU instances on AWS - G-series (L40S), P-series (A100), NVIDIA GPU operator/device plugins
- Security hardening - Pod Security Admission, OPA/Gatekeeper, image scanning, secrets management
- Linux systems - performance tuning, storage (EBS, EFS, FSx for Lustre), kernel parameters
Core Tech Stack :
- Terraform
- AWS (EKS, EC2 GPU, VPC, IAM, EBS/EFS/FSx, ECR)
- Karpenter
- Helm
- Kustomize
- ArgoCD
- NVIDIA GPU Operator/DCGM
- Calico, Istio
- Prometheus/Grafana/KubeCost
- OPA/Gatekeeper
- Falco, Trivy
Benefits :
Why You'll Love Working at Aivar :
- Learn from Experts : Work directly with former AWS leaders and AI pioneers.
- Direct Ownership : Lead high-impact "greenfield" projects from concept to global launch.
- Modern Tech : Master the latest Generative AI frameworks and cloud-native architectures.
- Real-World Impact : Build mission-critical systems used by major global enterprises.
- Rapid Growth : Scale your career quickly in a high-speed
Diversity and Inclusion :
Aivar Innovations is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to gender, gender identity, sexual orientation, religion, disability, age, marital status, caste, or any other protected characteristic, and we are committed to building a diverse, inclusive, and respectful workplace for everyone.
Posted by : AIVAR INNOVATIONS PRIVATE LIMITED (Recruiter at Aivar Innovation Pvt. Ltd)
Posted in : DevOps / SRE
Functional Area : DevOps / Cloud
Job Code : 1625847