Job Description

Experience : 6 to 8 years.

Location : Bangalore.


Required Skills :


- Strong hands-on experience with Apache Spark using Scala, Hive, and HDFS.


- Proficiency in Oozie workflows, ScalaTest, and Spark performance tuning.


- Deep understanding of Spark UI, YARN logs, and debugging distributed jobs.


- Working knowledge of CI/CD pipelines, GitHub, Maven, and Nexus.


- Ability to write unit tests and follow best practices for scalable data pipelines.

Key Skills :


- Apache Spark (with Scala).


- Scala language expertise.


- Apache Hive.


- HDFS and Hadoop Ecosystem.


- Oozie workflow orchestration.

Preferred Qualifications :


- 6 to 8 years of experience working on Big Data platforms.


- Hands-on performance optimization and memory tuning for Spark jobs.


- Familiarity with Agile/Scrum methodologies.


- Experience working with enterprise-level distributed data systems.


- Strong problem-solving and analytical skills.


What You'll Do :

Spark Development : Design, develop, and maintain robust and scalable data processing applications using Apache Spark with Scala.


Data Orchestration : Implement and manage complex data workflows using Oozie.


Performance Tuning : Optimize performance and tune memory for Spark jobs, using the Spark UI and YARN logs to debug distributed jobs.


Data Storage & Querying : Work proficiently with Apache Hive for data warehousing and HDFS for distributed storage within the Hadoop Ecosystem.


Quality Assurance : Write comprehensive unit tests using ScalaTest and adhere to best practices for building scalable and reliable data pipelines.


CI/CD & Version Control : Use CI/CD pipelines, GitHub, Maven, and Nexus for continuous integration, delivery, and version control.


Troubleshooting : Diagnose and resolve complex issues in distributed data systems, ensuring data accuracy and pipeline stability.


Collaboration : Work within an Agile/Scrum environment, collaborating with cross-functional teams to deliver high-quality data solutions.

