Posted on: 23/04/2026



Role Overview :
As a Consultant within the Dell AI & Data CoE, you will lead the architecture and implementation of large-scale AI platforms for our most strategic global customers. You are not just a builder; you are a technical visionary who helps clients navigate the complexities of the Dell AI Factory with NVIDIA and OSS.
You will bridge the gap between Dell's world-class hardware (PowerEdge, PowerScale, PowerSwitch) and the advanced software orchestration layers (NVAIE, Kubernetes, Slurm) required to turn raw silicon into business value.
Key Responsibilities :
CLIENT ADVISORY & ARCHITECTURAL DESIGN :
- Lead technical workshops to design Sovereign AI and Private Cloud AI platforms using Dell Validated Designs (DVD).
- Act as a Subject Matter Expert (SME) on the integration of NVIDIA AI Enterprise (NVAIE) with Dell PowerEdge XE servers (H100/H200/B200).
- Develop high-level and low-level designs (HLD/LLD) that incorporate GPU/Network Operators and high-speed InfiniBand/RoCE fabrics.
ADVANCED ORCHESTRATION & HPC INTEGRATION :
- Deploy and optimize Red Hat OpenShift and upstream Kubernetes in air-gapped or hybrid-cloud enterprise environments.
- Implement advanced workload scheduling and fractional GPU slicing using Run:ai or Slurm to maximize client ROI on hardware.
- Guide customers in choosing and implementing the right orchestration layer (e.g., BCM for bare metal vs. Kubernetes for microservices).
MLOPS ECOSYSTEM DELIVERY :
- Architect end-to-end MLOps pipelines utilizing Kubeflow, MLflow, or ClearML to streamline the "data-to-model" lifecycle.
- Enable distributed training and fine-tuning (LLMs/GenAI) for clients using Ray and PyTorch on Dell infrastructure.
- Integrate Rafay for clients requiring decentralized or multi-cluster AI management across edge and core data centres.
PRACTICE DEVELOPMENT & THOUGHT LEADERSHIP :
- Contribute to the CoE by developing reusable IP, deployment playbooks, and automated Ansible/Helm/Terraform scripts.
- Mentor junior consultants and lead technical proof-of-concepts (PoCs) that demonstrate the performance of Dell-NVIDIA stacks.
Technical Requirements :
1. Infrastructure : Deep expertise in Dell PowerEdge (XE/R series), PowerScale, and PowerSwitch networking.
2. GPU Orchestration : Mastery of NVIDIA GPU Operator, Network Operator, and NVIDIA Base Command Manager (BCM).
3. Cloud-Native : Expert-level Kubernetes (CKA/CKS) or Red Hat OpenShift skills, including complex security, CNI (Cilium/Multus) and storage (CSI) configurations.
4. Workload Management : Experience with Run:ai, Slurm, or Altair PBS for high-concurrency AI environments.
5. ML Platforms : Hands-on experience with Kubeflow, MLflow, Ray, and ClearML.
6. Automation : Advanced Ansible, Helm, Terraform, and Python skills for "Infrastructure as Code" delivery.
Qualifications :
1. Education : Bachelor's or Master's degree in Computer Science, Software Engineering, or a related technical field.
2. Experience : 10+ years in professional services or consulting, with a heavy focus on AI, Big Data, or HPC infrastructure.
3. Communication : Exceptional client-facing communication skills, e.g., the ability to explain complex GPU-to-GPU communication (NVLink/NVSwitch) to C-level stakeholders.
4. Travel : Willingness to travel to client sites as needed to lead deployments.
5. Preferred Certifications : CKA or Red Hat Certified Specialist, NVIDIA Certified Associate/Professional, Dell PowerEdge/PowerScale Proven Professional.
Posted in
Platform Engineering / SAP/Oracle
Functional Area
Technical / Solution Architect
Job Code
1630866