Kale Logistics - Senior Site Reliability Engineer - CI/CD

Kale Logistics Solutions Pvt.Ltd
6 - 10 Years
Pune

Posted on: 26/02/2026

Job Description

Join Kale Logistics Solutions :

Incorporated in 2010, Kale Logistics Solutions is a trusted global cloud-based tech provider for several Fortune 500 companies worldwide, offering a comprehensive suite of tech solutions for the logistics industry.

With in-depth domain knowledge and technical expertise, Kale has created a suite of comprehensive enterprise systems and Cargo Community Platforms, which offer a single electronic window that supports operational flows, distributes data to stakeholders, and facilitates the paperless exchange of trade-related information among them.

Kale's community and enterprise solutions cater to a wide network of Logistics Service Providers (LSPs) and help strengthen and improve their operational and business capabilities.

With offices in India, the UAE, Kenya, the Netherlands, and North America, and 5,500+ clients across 40 countries, Kale Logistics Solutions is a major player in the industry.

About the Role :

We are looking for a highly skilled Senior Site Reliability Engineer (SRE) to join our engineering organization.

As a senior member of the team, you will play a key role in designing, building, and operating highly scalable, reliable, and secure systems across cloud and on-prem environments.

You will partner closely with product engineering, DevOps, security, and platform teams to drive reliability, improve developer velocity, and achieve operational excellence.

This role requires hands-on experience with large-scale distributed systems, deep expertise in automation and infrastructure engineering, and a passion for reducing toil through code.

What You'll Do :

Reliability & Performance :

- Ensure availability, resilience, scalability, and performance of production systems.

- Define, implement, and enforce SLIs, SLOs, and error budgets.

- Conduct capacity planning, load testing, and performance tuning.

Automation & Operations Engineering :

- Automate manual operational tasks via tooling, scripts, and platform services.

- Develop infrastructure as code (IaC) for cloud and on-premise environments.

- Implement CI/CD improvements and production-safe rollout strategies (blue/green, canary, feature toggles).

Observability & Monitoring :

- Build, manage, and improve logging, metrics, tracing, and alerting.

- Implement proactive monitoring strategies to detect issues before they impact customers.

- Own incident management processes including postmortems and runbooks.

Security & Compliance :

- Integrate security controls into pipelines and runtime environments.

- Enforce least-privilege access, secret management, and vulnerability remediation.

- Partner with SecOps to ensure compliance in regulated environments.

Collaboration & Coaching :

- Work daily with engineering and DevOps teams to improve system reliability.

- Mentor junior team members on design, reliability, cloud systems, and operational excellence.

- Advocate SRE principles across engineering teams.

Incident Response & Continuous Improvement :

- Lead incident triage and recovery.

- Drive blameless post-incident reviews and systemic fixes.

- Reduce MTTR through tooling, automation, and resilient architectures.

Who You Are :

- 6 - 10+ years of experience in SRE/Systems Engineering roles.

- Expertise in Linux-based systems and distributed architectures.

- Proficiency in one or more programming/scripting languages : Python, Go, Bash, Java, or similar.

Hands-on experience with :

- Kubernetes (managed or self-hosted on-prem), Docker, and container ecosystems.

- Infrastructure automation tools : Terraform, Helm, etc.

- CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, Azure DevOps, etc.).

- Cloud experience with at least one major provider (AWS / Azure / GCP).

Strong understanding of :

- Networking concepts (DNS, load balancers, VPC, firewalls, NAT, routing).

- Observability stacks (Prometheus/Grafana, ELK, Splunk, OpenTelemetry, New Relic, Datadog).

- Experience running production systems at scale.

Preferred :

- Experience with on-prem infrastructure, VMware, or hybrid-cloud environments.

- Database reliability knowledge (PostgreSQL, MySQL, NoSQL stores such as MongoDB, caching systems).

Experience with :

- Distributed messaging (Kafka, RabbitMQ, SNS/SQS, etc.).

- Zero downtime deployments.

Background in :

- FinOps optimization.

- Resiliency patterns (circuit breakers, retries, autoscaling).

- Certification(s) in cloud platforms or Kubernetes.

Why Join Us :

- Empowerment and Growth : We provide opportunities for continuous learning and development to help you perform at your best.

- Inclusive Culture : We celebrate diversity and create an inclusive environment where everyone feels valued and respected.

- Innovation : Be part of a team that is driving innovation in the logistics industry with cutting-edge technology solutions.

- Global Impact : Work on projects that have a significant impact on global trade and logistics, contributing to the efficiency and sustainability of the industry.

