Talent.com
HPC Support Engineer

VirtualVocations
Huntsville, Alabama, United States
4 days ago
Job type
  • Full-time
Job description

A company is looking for a Super Intelligence HPC Support Engineer.

Key Responsibilities

Act as the primary technical point of escalation for customers running hyperscale GPU clusters

Lead incident response for complex issues, ensuring rapid triage and timely resolution

Proactively identify risks and drive preventative improvements in large environments

Required Qualifications

7+ years of experience in HPC or cloud support engineering with customer-facing responsibilities

Proven experience managing large-scale Linux clusters and distributed HPC/AI workloads

Deep expertise in orchestration tools such as Kubernetes and/or Slurm

Strong knowledge of GPU technologies (CUDA, NCCL, MIG, NVLink, GPUDirect RDMA)

Skilled in high-throughput networking and cluster storage solutions
