Senior Deep Learning Systems Software Engineer - AI Infrastructure

NVIDIA • Santa Clara, CA, US
30+ days ago
Job type
  • Full-time
  • Remote
Job description

NVIDIA is an industry leader with groundbreaking developments in High-Performance Computing, Artificial Intelligence, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science-fiction inventions, from artificial intelligence to autonomous cars.

NVIDIA is seeking senior engineers focused on performance analysis and optimization to help us squeeze every last clock cycle out of all facets of deep learning, including training and inference, one of today's most important workloads. If you are unafraid to work across all layers of the hardware/software stack, from GPU architecture to deep learning frameworks, to achieve peak performance, we want to hear from you! This role offers an opportunity to directly impact the hardware and software roadmap at a fast-growing technology company that leads the AI revolution, while helping deep learning users around the globe enjoy ever-higher training speeds.

What you'll be doing:

  • Understand, analyze, profile, and optimize deep learning workloads on state-of-the-art hardware and software platforms.

  • Build tools to automate workload analysis, workload optimization, and other critical workflows.

  • Collaborate with cross-functional teams to analyze and optimize cloud application performance on diverse GPU architectures.

  • Identify bottlenecks and inefficiencies in application code and propose optimizations to enhance GPU utilization.

  • Drive end-to-end platform optimization from the hardware level to the application and service levels.

  • Design and implement performance benchmarks and testing methodologies to evaluate application performance.

  • Provide guidance and recommendations on optimizing cloud-native applications for speed, scalability, and resource efficiency.

  • Share knowledge and best practices with domain expert teams as they transition applications to distributed environments.

What we need to see:

  • Master's degree in CS, EE, or CSEE, or equivalent experience

  • 8+ years of experience in application performance engineering

  • Experience using large-scale, multi-node GPU infrastructure on premises or in CSPs

  • Background in deep learning model architectures and experience with PyTorch and large-scale distributed training

  • Experience with application profiling tools such as NVIDIA Nsight and Intel VTune.

  • Deep understanding of computer architecture and familiarity with the fundamentals of GPU architecture. Experience with NVIDIA's infrastructure and software stacks.

  • Proven experience analyzing, modeling and tuning DL application performance.

  • Proficiency in Python and C/C++ for analyzing and optimizing application code

Ways to stand out from the crowd:

  • Strong fundamentals in algorithms and GPU programming experience (CUDA or OpenCL)

  • Understanding of NVIDIA's server and software ecosystem

  • Hands-on experience in performance optimization and benchmarking on large-scale distributed systems

  • Hands-on experience with NVIDIA GPUs, HPC storage, networking, and cloud computing.

  • In-depth understanding of storage systems, Linux file systems, and RDMA networking

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you.

The base salary range is 180,000 USD - 339,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
