
Senior Site Reliability Engineer - DevOps/Kubernetes

SPN Globe
Pune
7 - 11 Years

Posted on: 31/10/2025

Job Description

SPN Globe is a premier firm providing comprehensive consultancy and staffing solutions across a wide range of IT domains. We have positioned ourselves as a trusted partner for organizations seeking top-tier, niche-skilled talent, with a focus on quality, integrity, and timely delivery. We have successfully navigated the challenges of a volatile IT market while consistently expanding our reach. By maintaining a client-first, win-win approach and using innovative recruitment strategies, we have continued to grow steadily despite economic fluctuations.

Location : Pune

Joining : 0 to 30 days / Immediate Joiners

Objectives :


- Act as the Site Reliability Engineer for global operations, ensuring system stability, scalability, and efficiency through advanced automation, observability, and proactive infrastructure management.


- Provide expertise in Kubernetes, Linux, networking, and automation practices to support reliable deployments and resilient services.


- Maintain a strong sense of reliability, with clear awareness of the risks and impacts that infrastructure and application changes can have.

Role & Responsibilities :


- Has strong knowledge of Kubernetes (including Talos) for deployment, scaling, and maintaining containerized applications.


- Provides Linux administration expertise and ensures secure, efficient system operations.


- Implements and maintains GitOps workflows using Flux for consistent, automated deployments.


- Designs and manages infrastructure automation using Puppet and Terraform.


- Ensures reliable operation of databases such as MySQL/MariaDB, Yugabyte, and MongoDB, supporting data integrity and availability.


- Operates and integrates streaming platforms (Confluent, Strimzi) for event-driven and real-time processing.


- Develops automation scripts and tools using Python to improve operational efficiency.


- Oversees edge device management, ensuring secure connectivity and smooth lifecycle operations.


- Supports and integrates solutions with Azure and hybrid/multi-cloud environments.


- Builds and operates monitoring and observability systems (Datadog, Prometheus, Grafana) to ensure system health and transparency.


- Designs for scalability and high availability, including disaster recovery and failover strategies.


- Applies security best practices across infrastructure, applications, and data.


- Evaluates risks carefully before changes, ensuring reliable rollout strategies and minimizing downtime or service disruption.


- Monitors system reliability, identifies risks, and implements proactive improvements.


- Collaborates with global teams to share best practices and ensure consistency across environments.


- Defines and standardizes developer tooling (e.g., IDEs, code quality tools, CI/CD integrations) to ensure consistent development environments and maintain high software quality.


- Manages developer workstations and operating system standards (currently Ubuntu-based), ensuring performance, security, and compatibility across the engineering organization, with a focus on the Asia team.


- Promotes a documentation culture, ensuring clear processes, runbooks, and troubleshooting guides.


- Reports to the offshore Digital Manufacturing team based in Switzerland.

- Please also refer this opportunity to your friends right away, as we are closing all positions this week.

