DevOps Engineer

Talentstack
Multiple Locations
5 - 8 Years

Posted on: 10/12/2025

Job Description

What you'll do:

- Own the cloud, today and tomorrow: Design, build, and operate production infrastructure on AWS (EC2-first) while planning for multi-cloud portability (networking models, IaC abstractions, artifact strategy, secrets, identity).

- GPU platform operations: Stand up and operate GPU fleets for STT/TTS models (AMIs/containers, drivers/CUDA, capacity planning, autoscaling, utilization dashboards, and cost controls such as on-demand vs. spot where safe).

- Deploy at speed: Evolve CI/CD (currently Jenkins) for safe, fast, and repeatable releases; enable blue/green and canary patterns; enforce environment parity and automated rollbacks.

- Design for scale & cost: Architect capacity for millions of daily conversations; implement autoscaling, caching, and cost controls (Savings Plans/RIs, Graviton, storage lifecycle). Track unit economics (e.g., $/successful conversation, $/GPU hour).

- Observability & operations: Standardize Prometheus + Grafana and central logging; define SLIs/SLOs; run incident response/on-call with blameless postmortems and runbooks. Extend metrics to GPU health (thermals, ECC, driver) and queue back-pressure.

- Security & compliance (cloud + AI): Embed security (least-privilege IAM, KMS, VPC segmentation, secret management, image hardening), drive patching and vulnerability management, and own the SOC 2 / ISO 27001 cadence.

- Establish AI compliance practices (model/data governance, retention, dataset access controls, inference isolation) aligned to customer/regulatory needs.

- Single-tenant excellence: Productize per-tenant stacks (templated IaC, parameterized configs, release rings) for repeatability and isolation; ensure data residency and customer-specific controls.

- Manage customer-cloud deployments with secure access patterns and clear SLOs.

- Team leadership & growth: Coach and unblock three junior DevOps engineers; set standards, review designs, and hire for scale.

- Help shape the function into pods over time: Security, R&D/Tooling, FinOps, and Core Ops.

- Sales partner: Support RFPs/presales with cloud architecture, BoQs, security responses, and workshops; communicate trade-offs clearly to enterprise architects and CISOs.
