Talent.com
Director, Technical Program Management - AI and ML Platforms
Nvidia Corporation • Santa Clara, CA, United States
12 days ago
Job type
  • Full-time
Job description

The DGX Cloud organization builds and operates the AI infrastructure that makes this innovation possible. We are seeking a Director of Technical Program Management (TPM) to lead AI / ML Platform initiatives within the DGX Cloud Infrastructure team. This role will coordinate large-scale, cross-functional programs that shape how NVIDIA researchers develop, train, and deploy AI models on our global DGX Cloud platform. You will lead a team of TPMs responsible for orchestrating compute platforms, cluster bring-ups, workload scheduling, and platform enablement across NVIDIA's most advanced GPU systems.

As Director of Technical Program Management for AI / ML Platforms, your mission is to accelerate NVIDIA's research and product innovation by delivering a resilient, high-performance AI platform that seamlessly integrates hardware, orchestration, and developer productivity. You will bridge NVIDIA Research, DGX Engineering, and Cloud Operations, ensuring our infrastructure evolves to meet the rapidly expanding scale and complexity of AI workloads.

What You'll Be Doing:

  • Lead and scale the Technical Program Management organization responsible for the DGX Cloud AI / ML platform, enabling 1,000+ NVIDIA researchers globally.
  • Drive the roadmap for end-to-end AI / ML infrastructure, spanning cluster bring-up, workload orchestration, GPU resource management, and integration with MLOps pipelines.
  • Collaborate with technology and research leaders to define platform requirements, align compute strategy with AI model development, and deliver a seamless researcher experience.
  • Lead complex programs involving next-generation systems (e.g., GB200) and fleet-wide scaling initiatives across OCI, GCP, and other hyperscalers.
  • Own platform efficiency and capacity management, using deep understanding of scheduling systems (e.g., Slurm, hybrid models) to optimize job placement, utilization, and turnaround time.
  • Establish data-driven operational metrics (availability, occupancy, wait times, throughput) and use them to guide continuous improvement and prioritization.
  • Implement governance and visibility frameworks that drive alignment, predictability, and accountability across AI platform initiatives.
  • Represent DGX Cloud programs to senior leadership, clearly articulating impact, risk, and value across engineering and research organizations.

What We Need to See:

  • 15+ overall years of technical program management experience, including 7+ years leading and developing TPM teams in infrastructure, AI / ML, or platform engineering domains.
  • Demonstrated success delivering AI / ML systems and platform initiatives at large scale, encompassing workload orchestration, data pipeline integration, model training environments, and GPU fleet management.
  • Deep technical understanding of AI / ML workflows, job scheduling (Slurm, Kubernetes, hybrid orchestration), and large-scale distributed systems.
  • Proficiency in optimizing resource usage and monitoring performance metrics in compute-heavy settings.
  • Experience building platforms across cloud and on-prem hybrid architectures, integrating with internal and external MLOps stacks.
  • Proficiency with observability and telemetry tools (e.g., Grafana, Prometheus) for infrastructure monitoring and performance analysis.
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience).

Ways to Stand Out from the Crowd:

  • Proficiency with AI / ML systems, model lifecycle management, and developer tools for large-scale training workloads.
  • Track record driving R&D productivity platforms and reducing friction for machine learning practitioners.
  • Experience in new product introduction (NPI) for research and infrastructure systems.
  • Deep familiarity with cloud compute and orchestration technologies, and a passion for automation and operational excellence.
  • Executive communication skills, able to translate complex technical programs into clear business and research outcomes.

NVIDIA is widely considered one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people on our team. If you're driven, excited by tech and AI, creative and autonomous, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 264,000 USD - 402,500 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until November 3, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
