Posted on: 28/04/2026
Ready to be a Titan?
ServiceTitan is the leading software platform for the trades, powering the businesses that keep the world running. We are seeking a Director of Infrastructure Software Engineering with a focus on SQL Reliability & Observability to lead a high-impact engineering organization at the intersection of Infrastructure, SRE, and Product Engineering. This role is critical to improving uptime, customer experience, and cost efficiency across our platform. You will execute the strategy to transform our systems from reactive operations to proactive, data-driven reliability engineering - while building and leading a world-class team in Bangalore.
What You'll Do:
- Help drive the vision for SQL reliability, data platform stability, and observability across the platform
- Define SLOs/SLIs, error budgets, and reliability frameworks that tie directly to business outcomes
- Build a roadmap for end-to-end observability spanning metrics, logs, and traces
- Oversee performance, scaling, and availability across Azure SQL, PostgreSQL, MySQL, Cosmos DB, MongoDB, Redis, and Kafka
- Drive meaningful improvements in query performance, optimization, capacity planning, HA/DR, and replication strategies
- Lead development of a unified observability platform leveraging OpenTelemetry
- Enable end-to-end correlation from Mobile → API → Backend → Database
- Reduce alert noise, improve signal quality, and deliver actionable dashboards and real-time insights
- Implement synthetic monitoring across web and mobile surfaces
- Introduce AI-driven anomaly detection and lead development of SRE Agent / AI-assisted debugging systems
- Hire and develop a team of SQL/DB Reliability Engineers, Observability Engineers, and SRE/Automation/AI Engineers
- Establish an engineering culture grounded in ownership, quality, and impact
- Mentor senior engineers and cultivate the next generation of technical leaders
- Partner with Infrastructure & Platform, Product Engineering, and Finance teams
- Align reliability, performance, and cost goals with broader business objectives
What You'll Bring:
- 10 to 15+ years of experience in software engineering, SRE, or infrastructure roles
- 5+ years in engineering leadership at the Manager or Director level
- Proven track record managing large-scale distributed systems and databases
- BS or MS in Computer Science or equivalent experience
- Deep expertise in SQL and relational database systems
- Strong hands-on experience with cloud platforms, Azure preferred
- Solid understanding of observability platforms (metrics, logs, tracing), Kubernetes, and performance engineering
- Familiarity with NoSQL systems (Cosmos DB, MongoDB, Redis, Kafka), OpenTelemetry, and CI/CD reliability practices
- Exposure to AI/ML-driven monitoring or root cause analysis systems is a plus
Posted in: DevOps / SRE
Functional Area: Senior Management
Job Code: 1631794