Posted on: 18/08/2025
Key Responsibilities:
- Design, develop, and maintain large-scale distributed data pipelines using Apache Spark and Scala (a brief sketch follows this list).
- Write clean, efficient, and maintainable Scala code adhering to industry best practices.
- Implement Spark Core, Spark SQL, and Spark Streaming modules for real-time and batch data processing.
- Collaborate with cross-functional teams to gather and understand data processing requirements.
- Optimize performance of complex queries and processing logic in Hadoop ecosystems.
- Develop and manage workflows using UNIX shell scripting, Hive, Sqoop, and Impala.
- Participate actively in Agile/Scrum ceremonies: daily stand-ups, sprint planning, retrospectives, etc.
- Provide production support and maintenance for existing applications, ensuring high availability and performance.
- Conduct root cause analysis and resolve data pipeline issues in collaboration with upstream/downstream teams.
- Stay up to date with the latest Big Data technologies.
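
For illustration only: a minimal sketch, in Scala, of the kind of Spark batch pipeline these responsibilities describe. The input path, column names, and output location are hypothetical and not part of this posting.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, sum, to_date}

    object DailyOrdersJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("daily-orders-batch")
          .getOrCreate()

        // Read raw order events from a hypothetical HDFS location.
        val orders = spark.read.parquet("hdfs:///data/raw/orders")

        // Aggregate order amounts per customer per day.
        val daily = orders
          .withColumn("order_date", to_date(col("order_ts")))
          .groupBy(col("customer_id"), col("order_date"))
          .agg(sum(col("amount")).as("total_amount"))

        // Write curated output, partitioned by date so downstream
        // Hive/Impala queries can prune by partition.
        daily.write
          .mode("overwrite")
          .partitionBy("order_date")
          .parquet("hdfs:///data/curated/daily_orders")

        spark.stop()
      }
    }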
Mandatory Skills:
- Strong proficiency in Scala, with 8+ years of hands-on experience.
- Solid experience with Apache Spark for building distributed data processing applications.
- In-depth understanding of data structures, algorithms, and design patterns.
- Strong command of SQL and working knowledge of NoSQL databases (see the short Spark SQL example after this list).
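
Again purely illustrative: the same style of aggregation expressed through Spark SQL, the kind of query work the skills above call for. The table and column names are made up for the example.

    import org.apache.spark.sql.SparkSession

    object OrdersSqlExample {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("orders-sql-example")
          .getOrCreate()

        // Register a hypothetical Parquet dataset as a temporary view.
        spark.read.parquet("hdfs:///data/raw/orders")
          .createOrReplaceTempView("orders")

        // Plain SQL over the view; Spark SQL compiles this to the
        // same distributed execution plan as the DataFrame API.
        val topCustomers = spark.sql(
          """SELECT customer_id, SUM(amount) AS total_amount
            |FROM orders
            |GROUP BY customer_id
            |ORDER BY total_amount DESC
            |LIMIT 10""".stripMargin)

        topCustomers.show()
        spark.stop()
      }
    }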
Posted By: Priya C
Managing Director at LION and ELEPHANTS CONSULTANCY PVT LTD
Last Active: Not available (job posted through a third-party tool)
Posted in: Data Engineering
Functional Area: Big Data / Data Warehousing / ETL
Job Code: 1531333