hirist

Job Description

Job Title: Scala Data Engineer

Location: Bangalore (Whitefield), Hybrid (3 days WFO/week)

Experience Range: 8-15 years

Openings: 2

Notice Period: Immediate joiners, or a notice period of up to 2 weeks

Job Type: Full-Time


About the Role:


We are looking for an experienced Scala Data Engineer to design, build, and optimize large-scale data solutions. This role requires strong expertise in Scala, Apache Spark, and SQL with hands-on experience in data engineering at scale. You will play a critical role in building reliable data systems, ensuring performance, and mentoring junior engineers.


Key Responsibilities:

- Data Pipelines: Design, develop, and optimize scalable and efficient data pipelines using Scala and Apache Spark.

- Streaming Solutions: Build real-time and near-real-time streaming data pipelines with Kafka, Event Hubs, or Spark Structured Streaming.

- System Architecture: Own the design, architecture, and performance tuning of large-scale distributed data systems.

- Data Integration: Work with cloud-native environments (Azure preferred) and integrate with diverse data sources and formats.

- Quality & Governance: Implement data governance practices including lineage, metadata, and compliance (e.g., Alation, Collibra).

- Collaboration: Partner with product, analytics, and engineering teams to ensure timely and high-quality delivery.

- Mentorship: Guide junior engineers, review code, and enforce best practices for clean, maintainable, and scalable solutions.
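
To illustrate the kind of pipeline work described above, here is a minimal, purely illustrative Scala sketch combining a Spark batch aggregation with a Kafka-based Structured Streaming ingest. It assumes Spark 3.x with the spark-sql-kafka connector on the classpath; all paths, the topic name, and the broker address are placeholders, not details from this role.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object EventPipeline {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("event-pipeline")
      .getOrCreate()
    import spark.implicits._

    // Batch: read raw Parquet, aggregate daily counts, write back partitioned by date.
    val events = spark.read.parquet("/data/raw/events")   // placeholder path
    val daily = events
      .withColumn("event_date", to_date($"event_ts"))
      .groupBy($"event_date", $"event_type")
      .agg(count("*").as("event_count"))
    daily.write.mode("overwrite")
      .partitionBy("event_date")
      .parquet("/data/curated/daily_counts")              // placeholder path

    // Streaming: consume the same events from Kafka via Structured Streaming
    // and land them as Parquet with exactly-once checkpointing.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")   // placeholder broker
      .option("subscribe", "events")                      // placeholder topic
      .load()
      .selectExpr("CAST(value AS STRING) AS payload")

    stream.writeStream
      .format("parquet")
      .option("path", "/data/raw/events_stream")
      .option("checkpointLocation", "/chk/events")
      .start()
      .awaitTermination()
  }
}
```

A production version of this sketch would add schema enforcement on the Kafka payload, partition tuning, and monitoring, which are the performance and reliability concerns this role owns.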


Mandatory Skills:

- Strong programming experience in Scala (preferred) or Java.

- Hands-on expertise with Apache Spark for big data processing.

- Strong skills in SQL, data structures, and algorithms.

- Experience working on large-scale, production-grade data engineering systems.

- Exposure to cloud-native platforms (Azure preferred; AWS/GCP acceptable).


Preferred Skills:

- Experience with Hadoop ecosystem and streaming frameworks (Kafka, Event Hubs, Spark Streaming).

- Knowledge of Medallion architecture, Parquet, Apache Iceberg.

- Hands-on with orchestration tools (Airflow, Oozie).

- Familiarity with NoSQL databases.

- Experience with Git, Docker, and CI/CD tools such as Jenkins.

- Exposure to data governance tools (Alation, Collibra).

