hirist

Data Engineer II - Apache Spark

4Bell Technology
3 - 5 Years
₹15-25 LPA
Bangalore

Posted on: 19/03/2026

Job Description

Overview :


We are looking for a skilled Data Engineer II to join our Data Platform team. You will play a key role in building and optimizing our next-generation data infrastructure. Operating at the scale of Flipkart (petabytes of data), you will design, develop, and maintain high-throughput distributed systems, bridging traditional big data engineering with modern cloud-native and AI-driven workflows.


Key Responsibilities :


Data Pipeline Development & Optimization :


- Build Scalable Pipelines: Design, develop, and maintain robust ETL/ELT pipelines using Scala and Apache Spark/Flink (Core, SQL, Streaming) to process massive datasets with low latency.


- Performance Tuning: Optimize Spark jobs and SQL queries for efficiency, resource utilization, and speed.


- Lakehouse Implementation: Implement and manage data tables using modern Lakehouse formats like Apache Iceberg, Hudi, or Delta Lake, ensuring efficient storage and retrieval.
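To give candidates a concrete sense of the pipeline work above, here is a minimal sketch of a Scala/Spark batch job that promotes raw data into a cleansed Iceberg table. All table paths and column names are hypothetical, and it assumes a Spark 3.x runtime with the Iceberg runtime jar and a configured catalog named `lake`:

```scala
// Hypothetical sketch only: assumes Spark 3.x, the Iceberg runtime jar on the
// classpath, and an Iceberg catalog named "lake". Table/column names invented.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object EventsBronzeToSilver {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("events-bronze-to-silver")
      .getOrCreate()

    // Read raw (Bronze) events from an Iceberg table.
    val bronze = spark.read.format("iceberg").load("lake.bronze.events")

    // Cleanse and type (Silver): dedupe, drop null timestamps, derive a date.
    val silver = bronze
      .dropDuplicates("event_id")
      .filter(col("event_ts").isNotNull)
      .withColumn("event_date", to_date(col("event_ts")))

    // Create or overwrite the Silver Iceberg table.
    silver.writeTo("lake.silver.events").createOrReplace()

    spark.stop()
  }
}
```

Such a job would typically be packaged with sbt and submitted via `spark-submit` to a DataProc or Kubernetes-backed cluster.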


Data Management & Quality :


- Data Modeling: Apply Medallion Architecture principles (Bronze/Silver/Gold) to structure data effectively for downstream analytics and ML use cases.


- Data Quality: Implement data validation checks and automated testing using frameworks (e.g., Deequ, Great Expectations) to ensure data accuracy and reliability.


- Observability: Integrate pipelines with observability tools to monitor data health, freshness, and lineage.
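As an illustration of the data-quality responsibility above, a Deequ-style check on a Silver-layer DataFrame might look like the following sketch. Column names are hypothetical, and it assumes the `deequ` library version matching your Spark build plus a DataFrame `silver` already in scope:

```scala
// Hypothetical sketch: assumes com.amazon.deequ on the classpath and a
// DataFrame `silver` in scope. Column names are invented for illustration.
import com.amazon.deequ.{VerificationSuite, VerificationResult}
import com.amazon.deequ.checks.{Check, CheckLevel, CheckStatus}

val result: VerificationResult = VerificationSuite()
  .onData(silver)
  .addCheck(
    Check(CheckLevel.Error, "silver events checks")
      .isComplete("event_id")   // no null identifiers
      .isUnique("event_id")     // no duplicate rows per event
      .isNonNegative("amount")  // hypothetical numeric column
  )
  .run()

if (result.status != CheckStatus.Success) {
  // Fail fast so invalid data never reaches Gold tables or ML consumers.
  throw new IllegalStateException("Data quality checks failed")
}
```

Failing the pipeline run on a check failure keeps bad records out of downstream (Gold) tables rather than silently propagating them.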


Cloud Native Engineering :


- Cloud Infrastructure: Deploy and manage workloads on GCP DataProc and Kubernetes (K8s), leveraging containerization for scalable processing.


- Infrastructure as Code: Contribute to infrastructure automation and deployment scripts.


Collaboration & Innovation :


- GenAI Integration: Explore and implement GenAI and Agentic workflows to automate data discovery and optimize engineering processes.


- Agile Delivery: Work closely with architects and product teams in an Agile/Scrum environment to deliver features iteratively.


- Code Reviews: Participate in code reviews to maintain code quality, standards, and best practices.


Required Qualifications :


- Experience: 3-5 years of hands-on experience in Data Engineering.


- Primary Tech Stack:


- Strong proficiency in Scala and Apache Spark (Batch & Streaming).


- Solid understanding of SQL and distributed computing concepts.


- Experience with GCP (DataProc, GCS, BigQuery) or equivalent cloud platforms (AWS/Azure).


- Hands-on experience with Kubernetes and Docker.


- Architecture & Storage:


- Experience with Lakehouse table formats (Iceberg, Hudi, or Delta).


- Understanding of data warehousing and modeling concepts (Star schema, Snowflake schema).


Soft Skills :


- Strong problem-solving skills and ability to work independently.


- Good communication skills to collaborate with cross-functional teams.


Education Qualification


- Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related quantitative field.


Preferred Qualifications :


- Machine Learning Background: Familiarity with ML concepts, feature engineering, or experience building data pipelines for ML models is highly preferred.


- Experience with workflow orchestration tools (Airflow, Azkaban, etc.).


- Familiarity with real-time analytics databases (Druid, ClickHouse, HBase).


- Experience with CI/CD pipelines for data applications.


Why Join Us ?


- Work on petabyte-scale challenges that define the industry standard.


- Collaborate with top-tier engineers in a high-growth environment.


- Opportunity to work with cutting-edge technologies like Iceberg, K8s, and GenAI.

