
Data Engineer

EQUITY DATA SCIENCE INDIA PRIVATE LIMITED
2 - 6 Years
Mumbai

Posted on: 25/03/2026

Job Description

Job Title : Data Engineer



Hybrid : 3 days work from office (WFO) per week



Office Location : Mumbai - Commerz II - Goregaon (East)



Work Timings : 3:00 pm to midnight



Our customers and AI agents are pushing the limits of modern data warehouse evolution. Be part of it. You'll contribute to architecting the next generation of our data platform, balancing real-time serving, analytical queries (<5s aggregates), and streaming ingestion, all while serving institutional investors who analyze markets 10+ hours daily.



EDS Track Record : 8 years profitable. Northern Trust partnership (investor & channel). SOC 2 Type II certified. Tier-1 hedge fund clients. Transparent runway reviewed in town halls. Leadership that still writes code.



The Role in Reality :



Current State : Snowflake data warehouse with batch processing, dbt transformations, Redis caching layer.



Core Ownership : This role owns denormalized, materialized datasets end-to-end, from contract-safe inputs to deterministic, low-latency outputs.



What You'll Build (Future Direction) :



- Streaming pipelines (Kafka/Flink/Snowflake) replacing batch processes



- Enhanced Snowflake lakehouse with contract-enforced stage promotion (raw/validated/modeled)



- Redis-fronted serving layer backed by ClickHouse/StarRocks for real-time queries



- Expanded governance with JSON Schema contracts, Great Expectations validation, Airflow orchestration.
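To give a flavor of what contract-enforced stage promotion can look like, here is a minimal sketch using the jsonschema library: records that fail the contract stay in the raw stage, records that pass are promoted to validated. The schema, field names, and promotion helper are illustrative assumptions, not our actual contracts.

    import json
    from jsonschema import Draft7Validator

    # Hypothetical JSON Schema contract for a market-data record (illustrative fields).
    PRICE_CONTRACT = {
        "type": "object",
        "required": ["ticker", "as_of_date", "close_price"],
        "properties": {
            "ticker": {"type": "string", "minLength": 1},
            "as_of_date": {"type": "string", "format": "date"},
            "close_price": {"type": "number", "exclusiveMinimum": 0},
        },
        "additionalProperties": True,
    }

    validator = Draft7Validator(PRICE_CONTRACT)

    def promote_raw_to_validated(raw_records):
        """Split raw records into (validated, rejected) based on the contract."""
        validated, rejected = [], []
        for record in raw_records:
            errors = list(validator.iter_errors(record))
            if errors:
                rejected.append({"record": record, "errors": [e.message for e in errors]})
            else:
                validated.append(record)
        return validated, rejected

    if __name__ == "__main__":
        raw = [
            {"ticker": "ABC", "as_of_date": "2025-01-31", "close_price": 101.5},
            {"ticker": "", "as_of_date": "2025-01-31", "close_price": -4},  # fails the contract
        ]
        ok, bad = promote_raw_to_validated(raw)
        print(json.dumps({"validated": len(ok), "rejected": len(bad)}, indent=2))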




Latency & SLOs (Truthful) :



- Hot path (Redis over ClickHouse) : p95 <100ms on cache hits; <700ms on cache misses (see the sketch after this list)



- Historical analytics (Snowflake) : p95 <5s for defined aggregates; ad-hoc queries out of SLO



- Ingestion (Kafka/Flink/validated) : p95 <2 minutes end-to-end to the validated stage



- Error budget : 0.5% monthly (~3.6 hours downtime allowed)
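One common way to hit the hot-path numbers above is a cache-aside pattern: check Redis first, fall back to the analytical store on a miss, and write the result back with a TTL. The sketch below uses redis-py; query_clickhouse is a hypothetical stand-in for whatever ClickHouse/StarRocks client the serving layer ends up using, and key names and TTLs are assumptions.

    import json
    import redis

    # Assumed local Redis instance; connection details are illustrative.
    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    CACHE_TTL_SECONDS = 60  # keep hot aggregates fresh without hammering the OLAP store

    def query_clickhouse(portfolio_id: str) -> dict:
        """Hypothetical stand-in for a ClickHouse/StarRocks query on a cache miss."""
        # In reality this would run a parameterized SQL query against the serving store.
        return {"portfolio_id": portfolio_id, "exposure": 0.0}

    def get_portfolio_aggregate(portfolio_id: str) -> dict:
        key = f"agg:portfolio:{portfolio_id}"
        cached = r.get(key)
        if cached is not None:                     # cache hit: target p95 < 100 ms
            return json.loads(cached)
        result = query_clickhouse(portfolio_id)    # cache miss: target p95 < 700 ms
        r.setex(key, CACHE_TTL_SECONDS, json.dumps(result))
        return result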




How We Ship :


- Each pipeline ships with SLIs, SLOs, a runbook, and an error budget. Canary every DAG change; rollback via table version (see the sketch after this list). All infra through Terraform / AWS CDK; PRs require docs, owners, and tests; backfills are first-class with quotas.


- You'll help us evolve from reliable batch to real-time streaming while maintaining production stability.



- We evaluate work samples, not algorithmic puzzles. Our engineering culture values sustainable urgency: ship quickly but thoughtfully.
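"Rollback via table version" can be done with Snowflake Time Travel: clone the table as it looked before the offending statement, then swap the restored copy into place. The sketch below uses snowflake-connector-python; the connection parameters, table names, and query-ID handling are placeholders, not our actual setup.

    import snowflake.connector

    # Connection parameters are placeholders; in practice they come from a secrets manager.
    conn = snowflake.connector.connect(
        account="my_account", user="etl_user", password="***",
        warehouse="ETL_WH", database="ANALYTICS", schema="MODELED",
    )

    def rollback_table(table: str, bad_query_id: str) -> None:
        """Restore a table to its state just before a known-bad statement."""
        cur = conn.cursor()
        try:
            # Zero-copy clone of the pre-change version (Snowflake Time Travel).
            cur.execute(
                f"CREATE OR REPLACE TABLE {table}__restore CLONE {table} "
                f"BEFORE (STATEMENT => '{bad_query_id}')"
            )
            # Atomically swap the restored version into place.
            cur.execute(f"ALTER TABLE {table} SWAP WITH {table}__restore")
        finally:
            cur.close()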



Required Experience (Be Ready to Discuss Specific Examples) :


Data Platform Engineering :


- Snowflake at scale : Share a query optimization with before/after metrics and cost impact



- Batch & streaming concepts : Describe any streaming/real-time data challenge you've solved (any technology)



- Performance optimization : Explain how you improved data pipeline latency or throughput in production.




Reliability & Operations :



- Data quality : Walk through data validation or quality controls you've implemented



- Incident response : Detail a data incident from detection to resolution



- Observability : Show monitoring/alerting you created that prevented customer impact.




Technical Fundamentals :



- SQL optimization : Complex query you rewrote for significant performance improvement



- Python at scale : Data processing code you've optimized for memory/speed



- Data engineering patterns : Implementation of idempotency, deduplication, or consistency in production
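For the last bullet, one widely used production pattern is to make loads idempotent by deduplicating the incoming batch and upserting with MERGE on a natural key, so reruns and backfills don't create duplicates. The SQL below is an illustrative sketch with hypothetical table and column names, shown as a Python string for consistency with the rest of the stack.

    # Illustrative idempotent upsert: safe to re-run the same batch without creating duplicates.
    IDEMPOTENT_MERGE = """
    MERGE INTO modeled.daily_prices AS target
    USING (
        -- Deduplicate within the batch: keep the latest record per natural key.
        SELECT *
        FROM staging.daily_prices_batch
        QUALIFY ROW_NUMBER() OVER (
            PARTITION BY ticker, as_of_date
            ORDER BY loaded_at DESC
        ) = 1
    ) AS source
    ON  target.ticker     = source.ticker
    AND target.as_of_date = source.as_of_date
    WHEN MATCHED THEN UPDATE SET
        close_price = source.close_price,
        loaded_at   = source.loaded_at
    WHEN NOT MATCHED THEN INSERT (ticker, as_of_date, close_price, loaded_at)
        VALUES (source.ticker, source.as_of_date, source.close_price, source.loaded_at);
    """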




Interview focus : We'll explore your production experiences and how they apply to our evolution.



Preferred Experience :



- Infrastructure as Code : Terraform / AWS CDK experience for data platform provisioning



- Data contracts & governance : Experience with schema registries, data catalogs, or lineage tools



- CDC patterns : Debezium, Kafka Connect, or similar change data capture at scale



- Cost optimization : Track record of reducing data platform costs while maintaining performance



- Financial domain : Understanding of market data, trading systems, or investment workflows.




Our Stack :



- Current : Snowflake (primary), PostgreSQL, Redis, dbt, Python/FastAPI, Datadog, AWS



- APIs : REST for data access



- Building Toward : Kafka/Flink streaming, Airflow orchestration, ClickHouse/StarRocks serving layer
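To make the streaming direction concrete, here is a minimal consumer sketch using confluent-kafka that reads raw events and commits offsets only after they land downstream. The broker address, group id, and topic name are assumptions for illustration, not our actual configuration.

    from confluent_kafka import Consumer

    # Broker address, group id, and topic are placeholders for illustration.
    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "validated-stage-loader",
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["raw_market_events"])

    try:
        while True:
            msg = consumer.poll(timeout=1.0)
            if msg is None:
                continue
            if msg.error():
                print(f"consumer error: {msg.error()}")
                continue
            event = msg.value()  # raw bytes; would be parsed and contract-checked before landing
            # ... validate against the JSON Schema contract and write to the validated stage ...
            consumer.commit(msg)  # commit only after the event has landed downstream
    finally:
        consumer.close()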



You'll help architect our transition from batch to streaming while maintaining reliability.



What Distinguishes Senior vs Mid (Measurable) :



- Senior : Owns an SLO, authors an RFC, lands a phase-gate migration



- Mid : Lands 2 DAGs + 1 data quality (DQ) suite within guardrails



- Both levels contribute meaningfully; we hire for capability, not just seniority.



Leadership You'll Work With :


In 2012, Sandeep Varma was a quant analyst at a hedge fund when Steve Galbraith, Morgan Stanley's former CIO, handed him a framework that would change everything. For six years, Sandeep built and refined this platform in the trenches. When he met Greg McCall, a seasoned PM and author of The Monopoly Method, they shared a vision: create the unified platform investment managers actually need.



CTO Stan runs coding office hours and contributes to architecture. Chief Architect Naofumi codes alongside the team while designing systems. Chief Data Scientist Ben actively builds across the full stack, including AI agents. They understand production pressure because they live it.



This isn't technical leadership that only reviews PRs; they write them.



Growth Path (Realistic) :



- Months 1-3 : Own scoped pipelines, implement monitoring, shadow on-call



- Months 4-6 : Expand to critical components, co-define SLIs, join incident response



- Months 7-12 : Shape data consistency strategies, present architecture to clients



- Year 2+ : Platform Architect or Tech Lead earned through demonstrated impact

