A company is looking for a Super Intelligence HPC Support Engineer.
Key Responsibilities
Act as the primary technical point of escalation for customers running hyperscale GPU clusters
Lead incident response for complex issues, ensuring rapid triage and timely resolution
Proactively identify risks and drive preventative improvements in large-scale environments
Required Qualifications
7+ years of experience in HPC or cloud support engineering with customer-facing responsibilities
Proven experience managing large-scale Linux clusters and distributed HPC/AI workloads
Deep expertise in orchestration tools such as Kubernetes and/or Slurm
Strong knowledge of GPU technologies (CUDA, NCCL, MIG, NVLink, GPUDirect RDMA)
Proficiency in high-throughput networking and cluster storage solutions
HPC Engineer • Denver, Colorado, United States