hirist

Acceldata - Product Support Engineer - Spark/Hadoop

Posted on: 11/07/2025

Job Description

About the Role :


We are looking for a highly skilled and dedicated Product Support Engineer with deep expertise in Hadoop and Apache Spark to join our team.

In this role, you will be a subject matter expert (SME) in distributed data processing, focusing on optimizing, troubleshooting, and managing big data workloads.

You will play a critical role in ensuring the performance and reliability of our data pipelines and Spark clusters, providing expert-level support and collaborating with various teams.

This position requires flexibility to work in rotational shifts to support 24x7 operations and customer demand.


Key Responsibilities :


- Spark Application Design & Optimization : Design and optimize distributed Spark-based applications, ensuring low-latency, high-throughput performance for complex big data workloads.

- Expert Troubleshooting : Provide expert-level troubleshooting and resolution for any data or performance issues related to Spark jobs and clusters.

- Data Processing Expertise : Work extensively with large-scale data pipelines, leveraging Spark's core components including Spark SQL, DataFrames, RDDs, Datasets, and Structured Streaming.

- Performance Tuning : Conduct deep-dive performance analysis, debugging, and optimization of Spark jobs to significantly reduce processing time and resource consumption.

- Cluster Management & Collaboration : Collaborate effectively with DevOps and infrastructure teams to manage and maintain Spark clusters on various platforms such as Hadoop/YARN, Kubernetes, or cloud platforms (e.g., AWS EMR, GCP Dataproc, Azure HDInsight).

- Real-time Data Processing : Design and implement robust real-time data processing solutions utilizing Apache Spark Streaming or Structured Streaming.

- Rotational Shift Support : Work in rotational shifts, based on team coverage needs and customer demand, to support operations in a 24x7 environment, adjusting working hours as required.


Required Skills & Experience :


- Expert in Apache Spark : In-depth knowledge of Spark architecture, execution models, and its core components (Spark Core, Spark SQL, Spark Streaming, Spark MLlib, GraphX).

- Data Engineering Practices : Solid understanding and practical experience with ETL/ELT pipelines, data partitioning, shuffling, serialization techniques, and other best practices to optimize Spark jobs.

- Big Data Ecosystem : Strong knowledge of related big data technologies, including Hadoop (HDFS, YARN), Hive, Apache Kafka, and other components of the broader Hadoop ecosystem.

- Performance Tuning and Debugging : Demonstrated ability to effectively tune Spark jobs, optimize query execution plans, and troubleshoot complex performance bottlenecks.

- Experience with Cloud Platforms : Hands-on experience in deploying, managing, and running Spark clusters on leading cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP).

- Problem-Solving & Analytical Skills : Excellent analytical and problem-solving skills with a methodical approach to identifying and resolving complex technical issues.

- Communication : Strong verbal and written communication skills, with the ability to articulate technical concepts clearly to diverse audiences.

Good to Have :


- Certifications : Certification in Apache Spark or related big data technologies (e.g., Cloudera, Databricks).

- Data Observability Tools : Experience working with data observability and monitoring platforms such as Acceldata, Datadog, Prometheus, Grafana, or similar tools for monitoring and optimizing Spark jobs.

- Scripting Languages : Demonstrated experience with scripting languages such as Bash, PowerShell, or Python for automation and data manipulation.

- Containerization & Orchestration : Experience with containerized Spark environments using Docker and Kubernetes.

- Cloud Provider Certifications : Possession of certifications from leading Cloud providers (AWS Certified Big Data, Azure Data Engineer Associate, Google Cloud Professional Data Engineer).

- Security Management : Familiarity with concepts related to application, server, and network security management in big data environments.
