About
Hydra Host is a Founders Fund-backed NVIDIA cloud partner building the infrastructure platform that powers AI at scale. We connect AI Factories - high-performance GPU data centers - with the teams that depend on them: research labs training foundation models, enterprises running production inference, and developer platforms demanding scalable compute capacity.
This isn't a traditional support role. We're solving hard problems at the intersection of physical infrastructure and digital platforms. When a GPU cluster goes down, when an AI team is experiencing network degradation, when an SLA on an enterprise contract is breached, you're the person who ensures accountability, drives resolution, and documents what went wrong so it never happens again.
We've scaled fast, and our support systems need to catch up. You'll build them almost from scratch - establishing processes, defining SLAs, implementing tooling, and creating the operational discipline that ensures our infrastructure runs with the same precision as the silicon we deploy.
The AI infrastructure layer is being built right now. Companies are scrambling to secure GPU capacity, deploy clusters, and monetize excess compute. We're at the center of that transformation.
You'll work with world-class customers deploying next-gen AI hardware. You'll help solve real infrastructure challenges - not hypothetical SaaS edge cases. And you'll do it alongside a team that values craftsmanship and moving fast without breaking things (especially GPUs).
What you'll do
- Handle multi-tier support operations. Manage support for demand-side users (developers and ML teams using GPU compute), enterprise clients (companies deploying private infrastructure), and supply-side operators (data centers running Brokkr). Each has different SLAs, escalation paths, and support needs.
- Work with engineering to solve hard problems. Partner with infrastructure engineers and platform developers to troubleshoot complex issues. You'll figure out what's broken, pull in the right people, get to resolution, and make sure the same problem doesn't happen again.
- Own vendor accountability. When hardware fails, firmware has issues, or deliveries are delayed, you're the quarterback. Track vendor performance, escalate issues, negotiate remediation, and ensure our customers aren't left holding the bag for supplier problems.
- Manage SLA compliance and breaches. Define, track, and enforce SLAs across customer tiers and service types. When breaches occur, coordinate incident response, manage customer communication, and drive postmortem processes.
- Build support infrastructure. Establish ticketing systems, escalation matrices, on-call rotations, playbooks, and knowledge bases. Create the scaffolding that lets support scale from hundreds of customers to tens of thousands without breaking.
- Scale the function. As we grow, hire and mentor a support team. Build the culture and operating principles for support at Hydra Host.
Required qualifications
- 4+ years in customer support, success, or operations roles at technical B2B companies
- Experience supporting infrastructure, critical, or physical products where uptime and reliability matter
- Track record of building support processes and systems and scaling them through rapid growth
- Comfortable with technical concepts: APIs, server infrastructure, networking basics, cloud platforms
- Stellar communication skills - you can explain complex technical issues clearly to non-technical stakeholders
- Bias toward action and ownership. You see a problem, you fix it.
Preferred qualifications
- Experience in AI / ML infrastructure, GPU compute, or data center operations
- Experience implementing systems automation, AI agents, and support bots
- Familiarity with modern support & CRM platforms (Zendesk, Intercom, Pylon, Linear, HubSpot)
- Prior experience managing or mentoring support teams
What we value
- Customer obsession - You genuinely care about solving customers' problems, communicate proactively, and never leave customers in the dark.
- Principled thinking - You make decisions based on clear principles, not politics or convenience. When there's ambiguity, you fall back on what's right for the customer and the business long-term.
- Technical curiosity - You love learning how things work, even when it's outside your immediate domain.
- Systems thinking - You don't just solve one-off issues; you identify root causes and build solutions that prevent them from recurring.
What we offer
- Equity ownership - Meaningful stake in what we're building together.
- Competitive salary - We pay fairly and transparently.
- Healthcare coverage - Medical, dental, and vision for you and your dependents.
- Fully remote team - Remote-first with hubs in Phoenix, Boulder, and Miami, plus periodic team offsites.
- Direct impact - Your work will shape how thousands of GPU clusters get deployed and operated. Early team means your fingerprints are on everything.