Posted on: 23/01/2026
Description:
- Own end-to-end technical design and architecture of batch and streaming data pipelines
- Lead development of high-throughput, low-latency data processing systems using Java
- Design and implement real-time streaming solutions (event ingestion, processing, aggregation)
- Drive code quality, performance optimization, and scalability across systems
- Review code, mentor engineers, and set engineering best practices
- Collaborate with product, platform, infra, and security teams to deliver production-grade solutions
- Ensure data reliability, fault tolerance, and observability (metrics, logs, alerts)
- Automate operational workflows and deployment pipelines
- Participate in capacity planning, cost optimization, and system tuning
- Stay current with evolving Big Data and streaming ecosystems and evaluate new technologies
Required Skills & Experience:
Core Technical Skills:
- Strong proficiency in Java (multithreading, concurrency, JVM tuning)
- Hands-on experience with distributed data processing frameworks:
1. Apache Spark (Core, SQL, Streaming / Structured Streaming) or Apache Flink
2. Apache Kafka (producers, consumers, partitions, offsets, exactly-once semantics)
- Solid understanding of batch + streaming architectures (Lambda / Kappa patterns)
- Experience with Hadoop ecosystem components (HDFS, Hive, YARN or equivalents)
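The Java concurrency fundamentals listed above (multithreading, bounded queues, back-pressure) can be illustrated with a minimal producer-consumer sketch; all class and variable names here are illustrative, not part of the role's stack:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class PipelineSketch {
    // Run one producer and two consumers over n synthetic events;
    // returns the number of events processed.
    static long runPipeline(int n) throws InterruptedException {
        // Bounded queue provides back-pressure: the producer blocks when full.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        AtomicLong processed = new AtomicLong();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++) {
                    queue.put("event-" + i); // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Consumers drain the queue concurrently until all events are counted.
        Runnable consumer = () -> {
            try {
                while (processed.get() < n) {
                    String event = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (event != null) {
                        processed.incrementAndGet();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        Thread c1 = new Thread(consumer);
        Thread c2 = new Thread(consumer);
        producer.start();
        c1.start();
        c2.start();
        producer.join();
        c1.join();
        c2.join();
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("processed=" + runPipeline(10_000)); // prints processed=10000
    }
}
```

Real pipelines would replace the in-memory queue with a durable log such as a Kafka topic, but the same back-pressure and per-consumer concurrency concerns apply.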
Data & Storage:
- Strong knowledge of relational databases (PostgreSQL, MySQL)
- Experience with NoSQL / distributed datastores (HBase, Cassandra, MongoDB, Pinot, etc.)
- Understanding of data modeling, partitioning, and schema evolution
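On the data-modeling side, key-based partitioning (the idea behind Kafka's default partitioner and partition keys in distributed datastores) can be sketched in a few lines. Note this is a simplified illustration: Kafka actually uses a murmur2 hash rather than Java's `hashCode`, and the method name below is made up for the example:

```java
public class PartitionSketch {
    // Map a record key to one of numPartitions partitions.
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 8;
        // The same key always maps to the same partition, which is what
        // preserves per-key ordering when each partition is consumed in order.
        System.out.println("user-42 -> " + partitionFor("user-42", partitions));
        System.out.println("stable: "
                + (partitionFor("user-42", partitions) == partitionFor("user-42", partitions)));
    }
}
```

The design consequence is the one interviewers usually probe: ordering guarantees hold only within a partition, so the choice of partition key determines both ordering semantics and how evenly load spreads.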
Platform & Operations:
- Experience with Linux-based systems and production deployments
- Exposure to containerization and orchestration (Docker, Kubernetes preferred)
- Familiarity with monitoring and observability tools (Grafana, Prometheus, ELK, etc.)
- Experience with CI/CD pipelines and automated testing frameworks
Leadership & Soft Skills:
- Proven experience leading technical teams and driving delivery
- Strong problem-solving and debugging skills in complex distributed systems
- Ability to take ownership of critical systems and make architectural decisions
- Excellent communication skills to work with cross-functional stakeholders
- Comfortable working in a fast-paced, evolving data ecosystem
Good to Have (Plus Skills):
- Experience with Flink / Kafka Streams / real-time analytics
- Exposure to cloud-native data platforms (AWS, GCP, Azure)
- Knowledge of data governance, security, and access control
- Experience in telecom, fintech, or large-scale consumer data platforms
Why This Role:
- Work on large-scale, real-time data systems
- High technical ownership and architectural influence
- Opportunity to shape next-generation streaming and analytics platforms
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1605149