Posted on: 23/01/2026
Description :
This requires a relentless focus on eliminating manual processes. You will also use our monitoring platform to improve the overall customer experience by systematically identifying and fixing issues that affect our customers. As an SRE, you will also help diagnose issues on the platform, drawing on a deep understanding of the SingleStore query engine and the backend infrastructure.
Roles And Responsibilities :
- Optimize the telemetry platform to identify customer-impacting events while providing relevant data to drive debugging
- Partner with the engineering team to optimize the performance of services for cloud architecture
- Debug Live Site events and conduct follow-up postmortems and root cause analysis (RCA)
- Participate in an SLA-driven on-call rotation, which will include after-hours, weekend, and rotating holiday participation
Required Skills And Experience :
- Infrastructure automation experience. Scripting experience (Python, Bash) required.
- Experience with the Prometheus monitoring stack. Experience with Grafana, Mimir and Loki is a plus.
- Knowledge of Kubernetes and the container ecosystem
- Strong cross-group collaboration and communication skills
- Experience with at least one of AWS, Azure, or Google Cloud
- Experience debugging, diagnosing, and troubleshooting complex production software
- Experience with on-call work and incident response
- B.S. degree in Computer Science or a related field
SingleStore is a global database company that empowers the world's leading organizations to build and scale cutting-edge AI applications on a unified data platform that supports real-time transactions, analytics, and search. Our platform handles streaming data ingestion, vector search, full-text search, and multi-model data types - all with high performance, petabyte-scale capacity, high user concurrency, and low latency.
Posted by
Lachi K.
Senior Talent Acquisition Partner at SINGLESTORE INDIA PRIVATE LIMITED
Posted in
DevOps / SRE
Functional Area
Site Reliability Engineering
Job Code
1605252