Talent.com
Senior Manager, Professional Services HPC Deployment
NVIDIA • Santa Clara, CA, US
30+ days ago
Job type
  • Full-time
  • Remote
Job description

NVIDIA is in search of an HPC Deployment Manager to bolster its Professional Services division. Across academia and industry, NVIDIA's products are driving ground-breaking advancements in deep learning, data analytics, and the optimization of data centers. Join our team, where we are at the forefront of constructing some of the globe's most expansive and rapid data centers! We seek an individual capable of supervising the deployment of cutting-edge InfiniBand and Ethernet technologies with a team comprising AI and HPC experts. This role demands dynamic interpersonal abilities and a customer-centric approach.

The chosen candidate will engage with clients, collaborators, and internal units to assess, delineate, and complete large-scale AI/HPC initiatives. They will orchestrate the day-to-day operations, guidance, and cultivation of a multi-layered team of HPC service professionals. This entails ensuring the timely delivery of a varied spectrum of AI HPC data center projects. Furthermore, this role offers an opportunity to thrive within a fast-paced, inventive, and technologically sophisticated atmosphere, emphasizing unparalleled performance and the exploration of an array of novel hardware and software technologies in AI supercomputing.

What you will be doing:

  • Directs and supervises the service HPC engineering functions in designing, developing, installing, and validating hardware and software for the Customer AI High-Performance Computing (HPC) systems.

  • Leads, handles, mentors, and builds a very hardworking HPC service engineering team to deliver innovative advances in high-performance computing AI systems.

  • Leads the planning, implementation, and performance of our HPC projects. Improves the integrity of system services bring-up and related activities by applying groundbreaking technical and operational knowledge to configure and maintain HPC AI network and server platforms.

  • Drives HPC team hardware and software deployment, plans, develops, and deploys procedures for system validation.

  • Leads team activities and drives tests and plans for customer HPC AI system implementations, custom scripts, and testing procedures to ensure the operational reliability of the system.

  • Supports the HPC Engineering team, working with other internal collaborators to develop and run a well-rounded strategy for delivering service quality and continuous service improvement. Supports governance for software engineering through the implementation of standards and quality measures.

  • Leads team member development, helping them set and achieve goals for their career growth. Develops an inclusive environment that values team member differences, creating a sense of belonging and appreciation. Contributes to a culture of trust and clarity.

  • Builds strong relationships with NVIDIA leaders, customers, partners, and collaborators. Works closely with them to identify, implement, and support NVIDIA's leading AI solutions engineering, maintaining currency with industry standards and innovations. Provides input around process optimization, department budgeting, and the monitoring and management of resources.

  • Serves as the domain authority with customers from planning calls through implementation.

What we need to see:

  • 8+ years of overall experience in IT, high-performance computing, or another related field; 3+ years of experience in a management or leadership role.

  • Demonstrated expertise in HPC systems design, configuration, and planning.

  • Proficiency with low-latency/high-bandwidth interconnect infrastructure (InfiniBand and Ethernet).

  • Expertise with HPC system software, including cluster management/provisioning tools and job schedulers (Slurm, Salt, xCAT).

  • Proficiency with shared and distributed memory parallelism (OpenMP, MPI, NCCL and HPL) and accelerators (GPUs).

  • Strong scripting ability (Bash, Perl, Python, etc.) and experience with programming fundamentals.

  • Expertise with administering, supervising, and maintaining secure Linux/Unix operating systems (CentOS, Solaris).

  • Experience establishing processes for maintaining system performance, managing best-in-class standards, and familiarity with cloud computing and container technologies.

  • Ability to understand and work with large, sophisticated systems, identify and resolve problems, handle performance, and troubleshoot network issues related to infrastructure.

  • Expertise with multi-vendor hardware/software management, security, and network/Internet protocols. Strong communication and social skills, with the ability to provide detailed information and high-level summaries to management-level individuals and groups, present the business side of technical topics to non-technical audiences, and develop positive working relationships and strong rapport with team members.

  • Bachelor's degree in computer science, information systems, or a related field or equivalent experience

  • Solid knowledge of HPC storage

  • Exemplary communication and interpersonal skills, with the ability to present the business side of technical topics to non-technical audiences and to build persuasive, effective relationships with various stakeholders and diverse individuals and groups.

Ways to stand out from the crowd:

  • InfiniBand experience.

  • Experience with GPU-focused hardware/software.

  • Experience with MPI.

  • Automation tooling background (Ansible, Salt, Puppet, etc.).

  • Ethernet and Storage technologies such as Lustre or GPFS.

The base salary range is 208,000 USD - 327,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and . NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
