Talent.com
Senior Engineering Manager - Compute Server Bring Up
Applications are no longer being accepted

NVIDIA • Santa Clara, CA, US
3 days ago
Job type
  • Full-time
Job description

Senior Engineering Manager

NVIDIA data center systems have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. These platforms bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA Networking, NVIDIA Data Center CPUs, and a fully optimized NVIDIA AI and HPC software stack.

We are seeking an excellent Senior Engineering Manager to lead the Compute Server Bring-Up team. This team is responsible for the bring-up, integration, validation, and troubleshooting of compute tray platforms for GPU racks, ensuring servers are fully functional and validated against requirements before mass deployment in data centers. You will directly lead a group of bring-up engineers and form a larger virtual team spanning NVIDIA software and firmware teams to ensure successful bring-up of compute platforms, both internally and with customers.

What you'll be doing:

  • Own Initial Power-On and Board Bring-Up: Lead the initial power-on and functional validation of compute trays (CPU, GPU, NIC, storage including NVMe, cooling, etc.) internally and with customers. Ensure all functional requirements are met.
  • Form and lead a virtual team across NVIDIA software and firmware teams to ensure subject matter experts are available as needed throughout bring-up. Report regularly on bring-up status to provide visibility and ensure teams across the company are fully engaged to help.
  • Oversee flashing, updating, and validation of firmware for all server components per the defined architecture. Ensure appropriate validation is done for boundary, stress, and regression testing, and confirm that telemetry, logging, and hardware management features work as required. Document pain points, bring-up failures, and recovery flows, and provide actionable feedback to hardware, firmware, and software teams. Ensure usability, firmware/BIOS update coverage, and error reporting for reliable customer installation and operation.
  • Factory & Manufacturing Support: Support manufacturing flows, firmware updates, and diagnostic procedures. Ensure BOM change signoff and process optimization.
  • Debug, Issue Resolution & Customer Support: Lead root cause analysis and resolution of bring-up failures. Collaborate with partners, ODMs, and customers for technical support.
  • Documentation & Knowledge Transfer: Own and maintain platform design guides, bring-up checklists, and install instructions. Provide training and enablement for internal and external teams.
  • Product Ownership: Drive product life cycles with QA teams, ensuring robust bring-up, productization, and delivery.
  • Performance Management: Conduct performance evaluations, develop a culture of excellence, and ensure high productivity.

What we need to see:

  • 5+ years of relevant experience managing systems/platform software teams, ideally in server bring-up, firmware development, or data center solutions. Deep experience operating successfully in a matrix environment, forming and leading high-impact virtual teams spanning multiple disciplines.
  • BS, MS, or PhD in EE/CS or a related field (or equivalent experience) with 12+ years of overall experience. Strong knowledge of compute tray designs, firmware enablement, and system-level architecture.
  • Proven track record of delivering scalable server products and solutions for large-scale data centers. Experience collaborating with hardware, firmware, manufacturing, diagnostics, and QA teams.
  • Experience with SCM (Git, Perforce) and project management tools (Jira).
  • Excellent written and oral communication skills, strong work ethic, and dedication to teamwork.
  • Hands-on experience with x86/ARM system architecture and coding (C/C++, Python).
  • You are a self-starter who loves to find creative solutions to complicated problems.
  • Proven excellence in server architecture, collaborating across teams to deliver server products against defined Key Performance Indicators (KPIs).
Ways to stand out from the crowd:

  • Experience leading bring-up for sophisticated compute architectures like GB200 NVL72.

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, hardworking and self-motivated, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 425,500 USD. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until November 25, 2025. NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
