The Role :
Tech Leads are the connective tissue of engineering delivery: the engineers who translate product intent into technical plans, manage cross-team dependencies, hold the quality bar sprint after sprint, and keep the team unblocked without becoming the bottleneck themselves.
- Own sprint-level technical planning: scope, dependencies, risk identification, and realistic estimation
- Hands-on : you write production code and lead code reviews; you are an engineer, not a coordinator
- First escalation path for technical blockers: you unblock the team, you don't re-route the blockage
- Cross-functional bridge: you speak both engineering and product fluently, so neither side feels lost in translation
Core Responsibilities :
Technical Planning & Execution :
- Lead sprint planning: translate product requirements into engineering tasks with clear acceptance criteria, estimated complexity, and identified dependencies
- Own the squad's technical design for delivery scope: API contracts, data model decisions, service integration points documented before coding starts
- Track delivery progress: identify risks early (unclear requirements, infra dependencies, third-party API uncertainties), escalate to Staff/Principal before they become blockers
- Manage tech debt visibility: log, prioritise, and negotiate tech debt sprints; ensure the team is not indefinitely accumulating debt while shipping features
Hands-On Engineering :
- Write production-grade backend code : APIs, event consumers, data pipelines, and integration adapters; your code is reviewed, approved, and deployed
- Lead code reviews : evaluate correctness, test coverage, error handling, observability, and design quality; your reviews are educational, not perfunctory
- Own the squad's service health: build runbooks, define alert thresholds, participate in on-call, and drive RCA after incidents
- Validate performance before release: load test new services, profile latency-sensitive paths, and confirm p99 SLAs are met before launch
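Confirming a p99 SLA from load-test data comes down to a percentile computation over the collected latency samples. As a minimal sketch (class and method names are ours, not part of any existing tooling), using the nearest-rank method:

```java
import java.util.Arrays;

// Hypothetical helper: computes the p99 latency from load-test samples
// using the nearest-rank method, then checks it against an SLA budget.
public class P99Check {
    static double percentile(double[] samplesMs, double p) {
        double[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        // Nearest-rank: ceil(p/100 * N), converted to a 0-based index.
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    static boolean meetsSla(double[] samplesMs, double budgetMs) {
        return percentile(samplesMs, 99.0) <= budgetMs;
    }
}
```

For 1,000 samples of 1..1000 ms, `percentile(samples, 99.0)` is 990.0, so a 1,000 ms budget passes and a 900 ms budget fails.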
AI-Powered Feature Delivery :
- Coordinate ML model integration delivery : work with Data Scientists to define API contracts, implement model endpoint calls, build fallback logic, and instrument prediction logging
- Deliver LLM-powered features : RAG API integration, conversational flow state management, output validation, and error surface handling
- Ship Voice AI features : coordinate between ASR/TTS services, the intent API, and the booking flow, ensuring low-latency end-to-end spoken booking
- Own A/B experiment infrastructure delivery for your squad: feature flag integration, metric collection, and experiment configuration
The Hard Engineering Problems You'll Face :
- Across all six platforms, the engineering challenges are real, non-trivial, and consequential:
Cache Invalidation at Speed :
- Fare data has a 30-second freshness window.
- A stale cache hit in the booking flow means a pricing error, a failed checkout, or a lost trust signal.
- Multi-tier cache design (L1/L2/L3), TTL strategies, event-driven invalidation via Kafka, and cache stampede prevention are all live problems.
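The stampede-prevention piece of this can be illustrated with a single-flight wrapper: when a hot fare key expires, only one caller loads from the origin pricing service while concurrent misses wait on the same in-flight result. A minimal sketch (the class, key format, and loader are ours, not the production design):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Illustrative single-flight loader: concurrent misses for the same key
// collapse into one origin call, preventing a cache stampede on expiry.
public class SingleFlightCache<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight =
            new ConcurrentHashMap<>();

    public V get(K key, Supplier<V> loader) {
        // Only the first caller installs a future; the rest join it.
        CompletableFuture<V> f = inFlight.computeIfAbsent(
                key, k -> CompletableFuture.supplyAsync(loader));
        try {
            return f.join();
        } finally {
            inFlight.remove(key, f); // next expiry triggers a fresh load
        }
    }
}
```

A real multi-tier design layers this under L1/L2 lookups and TTLs; the single-flight map only guards the expensive origin fetch.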
Distributed Concurrency :
- Train Tatkal opening: millions of concurrent writes for 72 berths per coach.
- Optimistic locking, distributed lease management, queue-based fairness, and atomic seat allocation without deadlock under pathological load.
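The atomic-allocation piece of this, scaled down to one coach, can be sketched with lock-free compare-and-set: two concurrent bookings can never win the same berth, and no locks are held that could deadlock under load. This is an illustration under our own assumptions, not the production allocator:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Illustrative lock-free berth allocator: compare-and-set from "free"
// (null) to the passenger id makes each allocation atomic.
public class BerthAllocator {
    private final AtomicReferenceArray<String> berths; // null = free

    public BerthAllocator(int count) {
        berths = new AtomicReferenceArray<>(count);
    }

    // Returns the allocated berth index, or -1 if the coach is full.
    public int allocate(String passengerId) {
        for (int i = 0; i < berths.length(); i++) {
            if (berths.get(i) == null
                    && berths.compareAndSet(i, null, passengerId)) {
                return i; // CAS won: this berth is ours
            }
            // CAS lost or berth taken: try the next one
        }
        return -1;
    }

    public String holderOf(int berth) {
        return berths.get(berth);
    }
}
```

The real problem adds queue-based fairness and distributed leases on top; the CAS loop is only the innermost correctness guarantee.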
Event Ordering Guarantees :
- A booking event must arrive before its payment event.
- But Kafka doesn't guarantee cross-partition ordering.
- Building booking state machines with idempotency, deduplication, and out-of-order event tolerance is a continuous engineering challenge.
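The tolerance pattern can be sketched as a per-booking state machine that parks a payment event arriving before its booking event and replays it once the booking lands, with event-id deduplication for idempotency. Names and states here are ours, a minimal illustration rather than the real booking model:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative booking state machine tolerating out-of-order events:
// a PAYMENT seen before its BOOKING is parked, then replayed.
public class BookingStateMachine {
    public enum State { NONE, BOOKED, PAID }

    private final Map<String, State> state = new HashMap<>();
    private final Map<String, List<String>> parkedPayments = new HashMap<>();
    private final Set<String> seenEventIds = new HashSet<>(); // dedup

    public State onBooking(String bookingId, String eventId) {
        if (!seenEventIds.add(eventId)) // duplicate delivery: no-op
            return state.getOrDefault(bookingId, State.NONE);
        state.put(bookingId, State.BOOKED);
        // Replay any payment that raced ahead of this booking.
        if (parkedPayments.remove(bookingId) != null)
            state.put(bookingId, State.PAID);
        return state.get(bookingId);
    }

    public State onPayment(String bookingId, String eventId) {
        if (!seenEventIds.add(eventId)) // duplicate delivery: no-op
            return state.getOrDefault(bookingId, State.NONE);
        if (state.getOrDefault(bookingId, State.NONE) == State.NONE) {
            // Booking not seen yet: park the payment for later replay.
            parkedPayments.computeIfAbsent(bookingId, k -> new ArrayList<>())
                          .add(eventId);
        } else {
            state.put(bookingId, State.PAID);
        }
        return state.getOrDefault(bookingId, State.NONE);
    }
}
```

In production the parked events and dedup set would live in a durable store keyed by Kafka partition, not in-process maps.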
Multi-Tenancy Blast Radius :
- A B2B enterprise client's policy engine change must not affect the B2C booking flow.
- Multi-tenant isolation in shared infrastructure (API gateways, Kafka topics, DB schemas, cache namespaces) must be designed from day one.
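One small but load-bearing part of that isolation is disciplined key namespacing in shared stores. As a minimal sketch (the key format and helper are ours, purely illustrative): prefix every cache key with a validated tenant id, so one tenant's entries can be flushed, quota-limited, or invalidated without touching another's.

```java
// Illustrative tenant-scoped cache key builder for a shared Redis
// cluster: a B2B client's keys can never collide with B2C keys.
public class TenantKey {
    public static String of(String tenantId, String resource, String id) {
        // Reject delimiter characters so a crafted tenant id cannot
        // escape its namespace.
        if (tenantId == null || tenantId.contains(":"))
            throw new IllegalArgumentException("bad tenant id: " + tenantId);
        return "t:" + tenantId + ":" + resource + ":" + id;
    }
}
```

The same principle extends to Kafka topic prefixes and DB schema names; the point is that isolation is enforced at construction time, not by caller discipline.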
AI Model Integration :
- Serving a ranking model in the search critical path at p99 <20ms requires GPU node management, model warmup, request batching, async inference patterns, and fallback to heuristic ranking when the model is unavailable.
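The fallback shape can be sketched as a model call bounded by a hard latency budget, degrading to a cheap heuristic ordering on timeout or error. Everything here (class name, the lexicographic stand-in heuristic) is our own illustration, not the production ranker:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Illustrative bounded model call: if inference misses its latency
// budget or fails, serve a heuristic ranking instead of blocking search.
public class RankedSearch {
    private final ExecutorService pool = Executors.newCachedThreadPool();

    public List<String> rank(List<String> candidates,
                             Callable<List<String>> modelCall,
                             long budgetMs) {
        Future<List<String>> f = pool.submit(modelCall);
        try {
            return f.get(budgetMs, TimeUnit.MILLISECONDS);
        } catch (Exception e) { // timeout, model error, interruption
            f.cancel(true);
            return heuristic(candidates);
        }
    }

    // Fallback: lexicographic order stands in for a real popularity score.
    static List<String> heuristic(List<String> candidates) {
        List<String> out = new ArrayList<>(candidates);
        Collections.sort(out);
        return out;
    }

    public void shutdown() {
        pool.shutdownNow();
    }
}
```

A 20 ms p99 budget additionally forces batching and warm models on the server side; the client-side timeout only caps the blast radius when those fail.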
AI-First Engineering Mandate :
- Platform Engineers at every level are responsible for building systems that AI and ML can run on, and increasingly, systems that are AI themselves.
ML Serving Infrastructure : your APIs must serve model predictions at p99 <20ms with graceful fallbacks; you design the latency budget allocation
Feature Pipeline Engineering : real-time feature computation (Kafka Streams, Flink) feeding the feature store at sub-second freshness
RAG Backend Systems : vector store integration, embedding generation pipelines, document chunking and indexing for knowledge retrieval
Agentic Workflow Infrastructure : durable execution systems (Temporal) for multi-step LLM agent workflows with retry and compensation logic
Voice AI Backend : ASR request routing, low-latency TTS pipelines, spoken intent API design for conversational booking flows
Recommendation API Design : serving infrastructure for collaborative filtering, session-based models, and personalised ranking endpoints
Price Intelligence Pipelines : real-time competitive price ingestion, fare change event streaming, lower-price guarantee trigger systems
A/B Experiment Infrastructure : feature flags, traffic splitting, metric collection, and experiment configuration systems
MCP Tool Orchestration : building the tool-use APIs that LLM agents call to execute booking, modify, and cancel operations safely
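For the A/B experiment infrastructure item above, the core traffic-splitting primitive is deterministic hash bucketing: the same user always lands in the same bucket for a given experiment, and mixing the experiment name into the hash keeps experiments independent of each other. A minimal sketch under our own naming assumptions:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative deterministic bucketer for percentage-based rollouts:
// hash(experiment + user) -> stable bucket in [0, 100).
public class ExperimentBucketer {
    static int bucket(String experiment, String userId) {
        CRC32 crc = new CRC32();
        crc.update((experiment + ":" + userId)
                .getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100); // CRC value is non-negative
    }

    static boolean inTreatment(String experiment, String userId,
                               int treatmentPercent) {
        return bucket(experiment, userId) < treatmentPercent;
    }
}
```

Because assignment is a pure function of (experiment, user), no assignment store is needed and ramping a treatment from 5% to 10% keeps the original 5% in treatment.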
Who You Are :
- 7 to 10 years in backend engineering, including 1 to 2 years in a technical lead, delivery lead, or senior IC role with cross-team coordination experience
- Comfortable holding both technical depth (system design, code review) and delivery accountability (planning, risk management, timeline ownership)
- Strong written communicator: clear design specs, effective sprint retrospectives, honest status reports
- Strong in Java/Kotlin; familiar with distributed systems fundamentals, REST, Kafka, and Redis
- Tier-I institute preferred (IIT / IIIT / NIT / IISC / BITS CSE / ISE)
Technology Stack :
Backend : Java, Kotlin, Spring Boot, Ktor
Systems : Kafka, Redis, REST, gRPC, MySQL/DynamoDB
Cloud : AWS (EKS, EC2, S3, RDS), Kubernetes, Docker
Tooling : CI/CD pipelines, Feature flags, Monitoring dashboards, Load testing (k6)