GPU Kernel Compiler Engineer, AI Inference

NVIDIA • Santa Clara, CA, United States

4 hours ago

Job type

Full-time

Job description

AI and GPU Software Engineer

NVIDIA's AI and GPU software is at the forefront of computing, fueling breakthroughs across deep learning, LLMs, and intelligent applications. Our team is building solutions for rapid development and deployment of GPU kernels for AI systems. We take the latest AI models, rigorously analyze them, develop and deploy the high-performance GPU kernels that define model performance, and integrate the derived techniques and methodologies into the tools that automate this process.

This role is a unique opportunity to shape the next generation of AI performance and efficiency. You will work hands-on with emerging AI models, collaborating across compiler, AI inference, and model performance teams. The focus is on building programming solutions that can be applied to concrete AI inference use cases to deliver real-world performance and development efficiency wins.

What You Will Be Doing:

Analyze state-of-the-art AI models, identifying key performance bottlenecks and opportunities at the kernel level.
Develop, optimize, and evaluate both hand-tuned and compiler-generated kernels for inference workloads, balancing speed and flexibility.
Design and build high-level DSLs and innovative compiler infrastructure to increase kernel developer productivity while achieving near-peak performance.
Collaborate with model, AI inference, and compiler teams to iterate on kernel fusion, autotuning, and sophisticated GPU programming techniques.
Benchmark performance across real workloads, diagnose root causes, and rapidly deploy optimizations that maximize hardware utilization on NVIDIA platforms.

What We Need To See:

Bachelor's, master's or PhD degree in Computer Science, Computer Engineering or related field, or equivalent experience.

At least 3 years of strong C++ and/or Python programming experience in systems and performance engineering.

Understanding of GPU architecture and proficiency in CUDA programming.

Intellectual curiosity and an interest in solving exciting problems and delivering practical results in production environments.

Ways To Stand Out From The Crowd:

Experience designing, developing and optimizing high-efficiency GPU kernels for modern AI workloads.

Experience building compilers, domain-specific languages, or automatic optimization systems.

Familiarity with popular compiler, GPU programming, and AI frameworks such as MLIR, LLVM, PyTorch, XLA, Triton, or CUTLASS.

Experience with AI/ML inference workloads and model performance analysis.

Strong communication skills and ability to collaborate in a cross-team environment.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.
