Talent.com
Head of Platform / AI Cluster Management - System Integrator

Hamilton Barnes Associates Limited
San Francisco, CA, United States
23 hours ago
Job type
  • Full-time
Job description

Ready to lead innovation at the intersection of platforms and artificial intelligence?

Join a pioneering technology company driving advancements in cloud, AI, and data-driven solutions across global markets. The organization is recognized for fostering innovation, scalability, and collaboration through cutting-edge platforms that empower enterprises to evolve intelligently.

The team is hiring a Head of Platform / AI Cluster Management to oversee the strategic development, integration, and optimization of AI and platform initiatives. The role will focus on leading cross-functional teams, enhancing performance and scalability, and aligning technology strategy with long-term business goals.

Shape the future of intelligent platforms and transformative innovation. Apply now!

Responsibilities

  • Own the scheduler / runtime layer (Slurm, Kubernetes, Ray), including multi-tenancy, quotas, and GPU / host fleet management.
  • Lead cluster operations across images, CI / CD, repair / health, performance / telemetry, and incident response.
  • Deliver platform services that ensure workload SLOs and reliable runtime execution.
  • Define and implement namespace / tenancy design, node health automation, golden images, admission controls, on-call runbooks, and go-live gates.
  • Collaborate closely with infra, SRE, and network teams to optimize workload placement and cluster efficiency.
  • Provide hands-on expertise in NCCL behaviours, placement strategies, and congestion signal management.

Requirements

  • Deep expertise in cluster management, scheduling, and runtime environments for large-scale compute.
  • Hands-on background with Slurm, Kubernetes, Ray, or similar orchestration platforms.
  • Strong understanding of NCCL performance tuning, workload isolation, and congestion management.
  • Experience scaling multi-tenant, GPU-heavy clusters with strict SLOs.
  • Ability to thrive in a startup environment with full ownership over platform and cluster strategy.

Salary

  • $500,000 gross per year (Negotiable)
