Posted on: 03/08/2025
Role : Kafka Developer
Location : Pune, India (with travel to onsite)
Exp : 10-15 years
Experience Required :
10+ years overall, with 5+ years in Kafka-based data streaming development. Must have delivered production-grade Kafka pipelines integrated with real-time data sources and downstream analytics platforms.
Overview :
We are looking for a Kafka Developer to design and implement real-time data ingestion pipelines using Apache Kafka. The role involves integrating with upstream flow record sources, transforming and validating data, and streaming it into a centralized data lake for analytics and operational intelligence.
Key Responsibilities :
- Develop Kafka producers to ingest flow records from upstream flow record exporters (e.g., IPFIX-compatible probes); a producer sketch follows this list.
- Build Kafka consumers to stream data into Spark Structured Streaming jobs and downstream data lakes (see the streaming sketch after this list).
- Define and manage Kafka topic schemas using Avro and Schema Registry to support schema evolution (a serializer configuration sketch follows this list).
- Implement message serialization, transformation, enrichment, and validation logic within the streaming pipeline.
- Ensure exactly-once processing, checkpointing, and fault tolerance in streaming jobs.
- Integrate with downstream systems such as HDFS or Parquet-based data lakes, ensuring compatibility with ingestion standards.
- Collaborate with Kafka administrators to align topic configurations, retention policies, and security protocols.
- Participate in code reviews, unit testing, and performance tuning to ensure high-quality deliverables.
- Document pipeline architecture, data flow logic, and operational procedures for handover and support.
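For context, a minimal sketch (in Scala, using the standard Kafka client) of the kind of producer described above. The broker address, topic name, key, and record fields are illustrative placeholders, not part of this role's actual stack:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

object FlowRecordProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    // Placeholder broker address; in practice this comes from deployment config.
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
    // Idempotence plus acks=all prevents duplicates on retry, a prerequisite
    // for the exactly-once guarantees mentioned in the responsibilities.
    props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true")
    props.put(ProducerConfig.ACKS_CONFIG, "all")

    val producer = new KafkaProducer[String, String](props)
    try {
      // Keying by exporter ID keeps each probe's records ordered within a partition.
      val record = new ProducerRecord("flow-records", "probe-01",
        """{"srcAddr":"10.0.0.1","dstAddr":"10.0.0.2","bytes":1420}""")
      producer.send(record)
    } finally {
      producer.close()
    }
  }
}
```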
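Schema-managed serialization, as in the Avro/Schema Registry bullet, typically swaps the string serializer for Confluent's KafkaAvroSerializer and points it at the registry. A configuration sketch, with placeholder URLs:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.ProducerConfig
import io.confluent.kafka.serializers.KafkaAvroSerializer

object AvroProducerConfig {
  def props: Properties = {
    val p = new Properties()
    p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // placeholder
    p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[KafkaAvroSerializer].getName)
    p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[KafkaAvroSerializer].getName)
    // The serializer registers and looks up schemas here; the registry's
    // compatibility rules are what make safe schema evolution possible.
    p.put("schema.registry.url", "http://localhost:8081") // placeholder
    p
  }
}
```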
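On the consumer side, the consume-transform-validate-land pattern from the bullets above looks roughly like this in Spark Structured Streaming; the flow schema, topic name, and HDFS paths are assumptions for illustration (real IPFIX field mappings are far richer):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.streaming.Trigger
import org.apache.spark.sql.types.{LongType, StringType, StructType}

object FlowRecordStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("FlowRecordStream").getOrCreate()

    // Hypothetical flat flow-record schema for illustration only.
    val flowSchema = new StructType()
      .add("srcAddr", StringType)
      .add("dstAddr", StringType)
      .add("bytes", LongType)

    val flows = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // placeholder
      .option("subscribe", "flow-records")                 // placeholder topic
      .load()
      .select(from_json(col("value").cast("string"), flowSchema).as("flow"))
      .select("flow.*")
      .filter(col("srcAddr").isNotNull) // basic validation step

    // Checkpointing records Kafka offsets and sink commits, giving fault
    // tolerance and exactly-once output for Spark's file-based Parquet sink.
    flows.writeStream
      .format("parquet")
      .option("path", "hdfs:///data/lake/flows")         // placeholder path
      .option("checkpointLocation", "hdfs:///chk/flows") // placeholder path
      .trigger(Trigger.ProcessingTime("1 minute"))
      .start()
      .awaitTermination()
  }
}
```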
Required Skills & Qualifications :
- Proven experience in developing Kafka producers and consumers for real-time data ingestion pipelines.
- Strong hands-on expertise in Apache Kafka, Kafka Connect, Kafka Streams, and Schema Registry.
- Proficiency in Apache Spark (Structured Streaming) for real-time data transformation and enrichment.
- Solid understanding of IPFIX, NetFlow, and network flow data formats; experience integrating with nProbe Cento is a plus.
- Experience with Avro, JSON, or Protobuf for message serialization and schema evolution.
- Familiarity with Cloudera Data Platform components such as HDFS, Hive, YARN, and Knox.
- Experience integrating Kafka pipelines with data lakes or warehouses using Parquet or Delta formats.
- Strong programming skills in Scala, Java, or Python for stream processing and data engineering tasks.
- Knowledge of Kafka security protocols including TLS/SSL, Kerberos, and access control via Apache Ranger (a client configuration sketch follows this list).
- Experience with monitoring and logging tools such as Prometheus, Grafana, and Splunk.
- Understanding of CI/CD pipelines, Git-based workflows, and containerization (Docker/Kubernetes).
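For the security item above, a hedged sketch of typical client-side settings for Kerberos authentication over TLS (SASL_SSL with the GSSAPI mechanism); the truststore path and password are placeholders, and topic-level authorization would be enforced server-side through Apache Ranger policies:

```scala
import java.util.Properties

object SecureClientConfig {
  def props: Properties = {
    val p = new Properties()
    // TLS for encryption in transit, Kerberos (GSSAPI) for authentication.
    p.put("security.protocol", "SASL_SSL")
    p.put("sasl.mechanism", "GSSAPI")
    p.put("sasl.kerberos.service.name", "kafka")
    p.put("ssl.truststore.location", "/etc/security/kafka.truststore.jks") // placeholder
    p.put("ssl.truststore.password", "changeit")                           // placeholder
    p
  }
}
```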
Posted in : DevOps / SRE
Functional Area : Other Software Development
Job Code : 1523464