Job Description

Role : L3 Support Engineer - HPC

Location : Bangalore, Chennai, Pune and Noida

Experience : 5 to 10 yrs

Role Summary :

The L3 Support Engineer manages and maintains powerful computing clusters used for complex simulations, data analysis, and scientific research, and ensures optimal performance, user access, software installations, and system security across the HPC environment.

The role focuses on optimizing large-scale computational environments, supporting complex simulations, and ensuring high performance and reliability.

- Maintain HPC clusters, including compute nodes, storage systems, and high-speed interconnects (e.g., InfiniBand).

- Optimize parallel computing workloads for applications such as CFD, CAE, structural analysis, or AI/ML, using tools such as MPI, OpenMP, CUDA, or OpenCL.

- Manage and configure job schedulers (e.g., SLURM, PBS, Grid Engine) to ensure efficient resource allocation and workload management.

- Perform system administration tasks, including installation, configuration, and maintenance of Linux-based HPC environments (e.g., RHEL, Ubuntu).

- Troubleshoot and resolve performance bottlenecks in hardware, software, or network components, using monitoring tools such as Nagios or Ganglia.

- Implement and maintain parallel file systems (e.g., Lustre, GPFS) and storage solutions to support large-scale data processing.

- Collaborate with cross-functional teams, including researchers, data scientists, and software engineers, to optimize application performance and scalability.

- Conduct benchmarking, performance modeling, and tuning to maximize system efficiency for applications such as Ansys Fluent, LS-DYNA, or similar.

- Develop and automate scripts/procedures (e.g., Bash, Python, PowerShell) for system monitoring, maintenance, and deployment.

- Provide technical leadership and mentorship to junior engineers, contributing to team growth and knowledge sharing.

- Participate in on-call rotations and occasional travel for equipment maintenance or collaboration with external stakeholders.

- Knowledge of HPC benchmarking tools.

