Posted on: 08/08/2025
Responsibilities:
- You will provide leadership in designing and implementing groundbreaking GPU computers that run demanding deep learning, high-performance computing, and computationally intensive workloads.
- Identify architectural changes and/or completely new approaches for accelerating our deep learning models.
- As an expert, you will help us with the strategic challenges we encounter, including compute, networking, and storage design for large-scale, high-performance workloads, effective resource utilization in a heterogeneous computing environment, evolving our private/public cloud strategy, capacity modelling, and growth planning across our products and services.
- As an architect, you are responsible for converting business needs associated with AI/ML algorithms into a set of product goals covering workload scenarios, end-user expectations, compute infrastructure, and execution time; this should lead to a plan for making the algorithms production-ready.
- Benchmark and optimize computer vision algorithms and hardware accelerators against performance and quality KPIs.
- Optimize algorithms for peak performance on GPU tensor cores.
- Collaborate with various teams to drive an end-to-end workflow from data curation and training to performance optimization and deployment.
- Provide technical leadership and expertise for project deliverables.
- Lead, mentor, and manage the technical team.
Requirements:
- MS or PhD in Computer Science, Electrical Engineering, or a related field.
- A strong background in the deployment of complex deep learning architectures.
- 5+ years of relevant experience in at least a few of the following areas: machine learning (with a focus on deep neural networks), including an understanding of DL fundamentals, adapting and training DNNs for various tasks, and developing code for one or more DNN training frameworks (such as Caffe, TensorFlow, or Torch); numerical analysis; performance analysis; model compression and optimization; and computer architecture.
- Strong knowledge of data structures and algorithms, with excellent C/C++ programming skills.
- Hands-on expertise with PyTorch, TensorRT, and cuDNN.
- Hands-on expertise with GPU computing (CUDA, OpenCL, OpenACC) and HPC (MPI, OpenMP).
- In-depth understanding of container technologies like Docker, Singularity, Shifter, and Charliecloud.
- Proficient in Python programming and Bash scripting.
- Proficient in Windows, Ubuntu, and CentOS operating systems.
- Excellent communication and collaboration skills.
- Self-motivated and able to find creative, practical solutions to problems.