Posted on: 12/02/2026
Description :
Job Title : Senior Data Engineer (Druid & Real-Time Systems)
Experience : 7+ Years
Type : Full-Time
Role Overview :
We are seeking a seasoned Senior Data Engineer to lead the architecture, deployment, and optimization of our high-performance analytics platform.
The core of this role involves managing massive-scale Apache Druid clusters to deliver sub-second OLAP queries.
You will be the bridge between raw data streams and actionable insights, building robust pipelines that integrate modern data lakehouses with real-time streaming technologies.
Key Responsibilities :
- Druid Architecture & Optimization : Design, deploy, and manage highly available Apache Druid clusters. Perform deep-dive performance tuning (compaction, indexing, caching) to ensure sub-second query latency on petabyte-scale datasets.
- Pipeline Engineering : Architect and maintain end-to-end data ingestion pipelines, covering real-time streaming via Kafka and Spark Structured Streaming as well as batch processing orchestrated with Airflow.
- Cloud Infrastructure : Manage Druid deployments on AWS (EKS). Handle scaling, resource allocation, and cost optimization across EC2, S3, and Kubernetes environments.
- Ecosystem Integration : Build reliable integrations between Druid and our broader data ecosystem, including Snowflake, Databricks (Delta Lake), and various data lakes.
- Observability & Security : Implement rigorous monitoring, alerting, and logging for distributed systems. Ensure data governance and security through IAM roles, encryption, and VPC configurations.
- Technical Leadership : Act as a subject matter expert on OLAP and distributed systems, guiding the team on best practices for columnar storage and high-concurrency query patterns.
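To make the real-time ingestion responsibility concrete, below is a minimal sketch of the kind of Kafka supervisor spec this role would submit to Druid's Overlord. The datasource name, topic, broker address, dimensions, and tuning values are placeholders, not details from this posting.

```python
import json

def build_kafka_supervisor_spec(datasource: str, topic: str, bootstrap_servers: str) -> dict:
    """Build a minimal Druid Kafka supervisor spec for streaming ingestion.

    All field values here (dimensions, granularities, task settings) are
    illustrative placeholders; a production spec would be tuned per dataset.
    """
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": "ts", "format": "iso"},
                "dimensionsSpec": {"dimensions": ["user_id", "event_type"]},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "MINUTE",
                },
            },
            "ioConfig": {
                "topic": topic,
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "taskCount": 2,
                "taskDuration": "PT1H",
            },
            "tuningConfig": {"type": "kafka", "maxRowsPerSegment": 5_000_000},
        },
    }

spec = build_kafka_supervisor_spec("events", "events-topic", "kafka:9092")
# In practice this dict would be POSTed to the Overlord at /druid/indexer/v1/supervisor.
print(json.dumps(spec["spec"]["ioConfig"], indent=2))
```

Submitting such a spec starts long-running indexing tasks on the MiddleManagers, which is where the compaction and segment tuning mentioned above come into play.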
Technical Requirements :
Must-Have Skills :
- Core Engine : Expert-level hands-on experience with Apache Druid (Deep storage, MiddleManagers, Brokers, and Historicals).
- Streaming & Compute : Mastery of Kafka and Spark Structured Streaming for low-latency data processing.
- Cloud & Orchestration : Strong proficiency in AWS (EC2, S3, IAM) and Kubernetes (EKS) for containerized deployments.
- Storage & OLAP : Deep understanding of distributed systems, columnar storage formats, and OLAP cubes.
- Programming : Fluent in Python, Scala, or Java.
- Modern Data Stack : Hands-on experience with Snowflake and Databricks (PySpark, Delta Lake).
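As a toy illustration of why the columnar-storage knowledge listed above matters for OLAP workloads (this is plain Python, not Druid code; the table and values are invented):

```python
# Toy illustration: why columnar layout helps OLAP scans.
# A row store must touch every field of every row, while a column
# store reads only the columns the aggregate actually needs.

rows = [
    {"country": "DE", "revenue": 120, "user_id": 1},
    {"country": "US", "revenue": 300, "user_id": 2},
    {"country": "DE", "revenue": 80,  "user_id": 3},
]

# Column-oriented layout: one contiguous list per field.
columns = {
    "country": [r["country"] for r in rows],
    "revenue": [r["revenue"] for r in rows],
    "user_id": [r["user_id"] for r in rows],
}

# SUM(revenue) WHERE country = 'DE' touches only two of three columns;
# the user_id column is never read.
total = sum(
    rev
    for country, rev in zip(columns["country"], columns["revenue"])
    if country == "DE"
)
print(total)  # 200
```

Engines like Druid take this further with per-column compression and indexes, which is what enables sub-second aggregations at the scales described above.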
Preferred Qualifications :
- Experience integrating BI tools (e.g., Looker, Superset, or Tableau) with Druid, including optimizing queries for those workloads.
- Contributions to open-source projects (Druid, Spark, or Kafka).
- Knowledge of Infrastructure as Code (Terraform or CloudFormation).
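For context on the BI-integration qualification, below is a sketch of the kind of payload a BI tool issues against Druid's SQL endpoint (`/druid/v2/sql`). The table name, columns, and broker URL are hypothetical.

```python
import json

# Hypothetical example of a Druid SQL query payload, as a BI tool such as
# Superset or Looker might generate it. Table and column names are invented.
payload = {
    "query": (
        "SELECT country, SUM(revenue) AS total_revenue "
        "FROM events "
        "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' DAY "
        "GROUP BY country "
        "ORDER BY total_revenue DESC "
        "LIMIT 10"
    ),
    "resultFormat": "object",
}

body = json.dumps(payload)
# In practice this would be POSTed to a Broker, e.g.:
#   requests.post("http://broker:8082/druid/v2/sql", data=body,
#                 headers={"Content-Type": "application/json"})
print(len(body) > 0)
```

Tuning such dashboard queries (filtering on `__time`, bounding result sets) is a typical part of the Druid/BI optimization work this posting mentions.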
Posted in : Data Engineering
Functional Area : Data Engineering
Job Code : 1612366