Talent.com
Senior Engineering Manager - Compute Server Bring Up

NVIDIA • Santa Clara, CA, US
2 days ago
Job type
  • Full-time
Job description

Senior Engineering Manager

NVIDIA data center systems have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. These platforms bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA Networking, NVIDIA Data Center CPUs, and a fully optimized NVIDIA AI and HPC software stack.

We are seeking an excellent Senior Engineering Manager to lead the Compute Server Bring-Up team. This team is responsible for the bring-up, integration, validation, and troubleshooting of compute tray platforms for GPU racks, ensuring servers are fully functional and validated against requirements before mass deployment in data centers. You will directly lead all aspects of a group of bring-up engineers and form a larger virtual team spanning NVIDIA software and firmware teams to ensure successful bring-up of compute platforms, both internally and with customers.

What you'll be doing:

  • Own Initial Power-On and Board Bring-Up: Lead the initial power-on and functional validation of compute trays (CPU, GPU, NIC, storage including NVMe, cooling, etc.) internally and with customers. Ensure all functional requirements are met.
  • Form and lead a virtual team across NVIDIA software and firmware teams to ensure subject matter experts are available as needed throughout bring-up. Report regularly on bring-up status to provide visibility and ensure teams across the company are fully engaged to help.
  • Oversee flashing, updating, and validation of firmware for all server components per the defined architecture. Ensure appropriate boundary, stress, and regression testing is done, and confirm that telemetry, logging, and hardware management features work as required. Document pain points, bring-up failures, and recovery flows, and provide actionable feedback to hardware, firmware, and software teams. Ensure usability, firmware/BIOS update coverage, and error reporting for reliable customer installation and operation.
  • Factory & Manufacturing Support: Support manufacturing flows, firmware updates, and diagnostic procedures. Ensure BOM change signoff and process optimization.
  • Debug, Issue Resolution & Customer Support: Lead root cause analysis and resolution of bring-up failures. Collaborate with partners, ODMs, and customers for technical support.
  • Documentation & Knowledge Transfer: Own and maintain platform design guides, bring-up checklists, and install instructions. Provide training and enablement for internal and external teams.
  • Product Ownership: Drive product life cycles with QA teams, ensuring robust bring-up, productization, and delivery.
  • Performance Management: Conduct performance evaluations, develop a culture of excellence, and ensure high productivity.

What we need to see:

  • 5+ years of relevant experience managing systems/platform software teams, ideally in server bring-up, firmware development, or data center solutions. Deep experience operating successfully in a matrix environment, forming and leading high-impact virtual teams spanning multiple disciplines.
  • BS, MS, or PhD in EE/CS or a related field (or equivalent experience) with 12+ years of overall experience. Strong knowledge of compute tray designs, firmware enablement, and system-level architecture.
  • Proven track record of delivering scalable server products and solutions for large-scale data centers. Experience collaborating with hardware, firmware, manufacturing, diagnostics, and QA teams.
  • Experience with SCM (Git, Perforce) and project management tools (Jira).
  • Excellent written and oral communication skills, strong work ethic, and dedication to teamwork.
  • Hands-on experience with x86/ARM system architecture and coding (C/C++, Python).
  • You are a self-starter who loves to find creative solutions to complicated problems.
  • Proven excellence in server architecture and in collaborating across teams to deliver server products against defined Key Performance Indicators (KPIs).
Ways to stand out from the crowd:

  • Experience leading bring-up for sophisticated compute architectures like GB200 NVL72.

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, hardworking and self-motivated, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 425,500 USD. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until November 25, 2025. NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
