Posted on: 26/08/2025
About The Role
ThoughtSpot is an AI-powered analytics platform that enables users to explore and analyze data through natural language queries, making insights accessible to all.
Our mission is to deliver reliable, high-performing applications that empower our customers.
We are seeking a Site Reliability Engineer who excels at technical support for our end users, incident management and resolution, and cloud operations within a customer-centric environment.
Role Overview :
We are seeking a Site Reliability Engineer (SRE) with a strong focus on customer-facing technical support.
What You'll Do :
Technical & Product Support :
- Serve as the first line of support for customer-reported technical issues related to our SaaS platform.
- These include data connectivity issues, report errors, performance concerns, access problems, data inconsistencies, software bugs, and integration challenges.
- Understand and empathize with the challenges ThoughtSpot users face, offering tailored solutions to improve their user experience.
- Provide prompt, accurate updates, meet SLAs, and resolve customer issues in a timely manner via tickets and calls.
- Create knowledge-base articles to document solutions and help customers self-serve.
System Reliability & Monitoring :
- Maintain, monitor, and troubleshoot ThoughtSpot cloud infrastructure.
- Monitor system health and performance through metrics, logs, and dashboards, using tools such as Prometheus and Grafana, to detect and prevent issues early.
- Work with Engineering teams to define and implement tools that enhance debuggability, supportability, availability, scalability, and performance.
- Build expertise in cloud and on-premises infrastructure by developing automation and best practices.
- Participate in the on-call rotation for critical SRE systems, and lead incident reviews and root cause analysis.
What You'll Bring :
- Exceptional communication skills, both written and verbal, to effectively engage with cross-functional teams, customers, and stakeholders.
- Relevant work experience troubleshooting complex Linux systems and managing distributed systems.
- Experience in virtualization and cloud technologies.
- Experience in enterprise customer support, on-call rotation for critical SRE systems, and leading incident reviews and root cause analysis.
- Ability to diagnose technical problems and work with Engineering on escalated issues.
- Strong problem-solving skills, algorithmic thinking, and a strong foundation in how systems should work.
- Understanding of the tools and frameworks required to operate and manage cloud infrastructure.
- Strong customer service skills.
- Solid communication skills and ability to work independently.
- Ability to leverage automation, monitoring and data analysis to ensure high availability.
- Familiarity with scripting languages such as Python, JavaScript or Bash.
- Exposure to infrastructure and service monitoring tools.
Ideal Candidate Profile :
You thrive in dynamic, customer-facing environments and are passionate about ensuring system reliability and customer satisfaction.
What We Offer :
- Competitive salary and benefits package.
- Opportunities for professional growth and career advancement.
- A collaborative work environment where your input and expertise directly impact our customer experience.
If you're ready to leverage your technical skills in a role that directly influences customer success and BI user satisfaction, we'd love to hear from you.
Posted in
DevOps / SRE
Functional Area
Site Reliability Engineering
Job Code
1536144