
Job Description

Role : HPC System Engineer

Full Time

Location : Hyderabad (REMOTE)

Notice Period : 30 Days



Responsibilities :


- Administration of HPC and VDI clusters

- User Account management for HPC onboarding and offboarding

- Creation and maintenance of AMIs in AMI accounts

- Install, configure, and maintain Linux operating systems on HPC clusters

- Support the necessary HPC components and native platform services by coordinating with the respective providers, e.g., EFPortal, AWS RES, CycleCloud, AWS ParallelCluster


- Support and management of AWS Managed Active Directory

- Continuous upgrades to the HPC platform and related components (OS, Java, Python, EFPortal, etc.)

- Implement and maintain the necessary compliance controls, i.e., US Export Control and confidentiality requirements

- Conduct regular audits, share the findings and implement corrective actions as required

- Coordinate with other teams, such as the v-drive team, in testing and migrating/installing engineering applications on the platform

- Manage job schedulers such as Slurm or LSF

- Use node provisioning tools like Warewulf

- Troubleshoot system issues and provide technical support to users

- Monitor system performance and ensure optimal operation of the HPC environment

- Collaborate with other IT professionals to integrate new technologies into the existing infrastructure

Requirements :

- Progressive experience in HPC system administration, preferably in a Red Hat/CentOS Linux environment

- Experience using AWS CloudFormation templates to build HPC infrastructure and storage (Amazon FSx for NetApp ONTAP and FSx for Lustre)

- Experience with parallel file systems and storage solutions

- Strong knowledge of job schedulers such as Slurm or LSF

- Familiarity with node provisioning tools like Warewulf

- Proficiency in Linux OS administration

- Knowledge of job scheduling tools (e.g., Slurm)

- Understanding of node provisioning tools (e.g., Warewulf)

- Excellent problem-solving abilities

- Linux+ certification preferred

- Top Secret Clearance : TS/SCI preferred

- On-site presence at customer location in Stennis, MS

- Availability for some on-call/weekend work

- Hands-on experience setting up HPC compute clusters

- Set up the PBS job scheduler and support PBS servers

- Experience with Red Hat and Rocky Linux; bash scripting

- Nice to have: Docker and Kubernetes experience

- Nice to have: storage knowledge

- Nice to have: networking and DevOps knowledge

Qualifications :

Minimum Qualifications / Skills :


- Bachelor's Degree required

- Preferably in Computer Science, Information Systems, or related field

Preferred qualifications / Skills :


- Very good written and verbal communication and presentation skills, with experience in a customer-facing role

- Ability to understand requirements in depth, with good analytical and problem-solving skills, interpersonal effectiveness, and a positive attitude

Must-have experience :


- Administration of HPC and VDI clusters

- Deployed and configured AWS ParallelCluster for HPC workload orchestration with CloudFormation templates (CFT)

- Deployed and configured AWS Managed Active Directory

- Provisioned Amazon FSx for NetApp ONTAP and FSx for Lustre storage for HPC

