Talent.com
GPU Kernel Compiler Engineer, AI Inference
NVIDIA • Santa Clara, CA, United States
4 hours ago
Job type
  • Full-time
Job description

AI and GPU Software Engineer

NVIDIA's AI and GPU software is at the forefront of computing, fueling breakthroughs across deep learning, LLMs, and intelligent applications. Our team is building solutions for rapid development and deployment of GPU kernels for AI systems. We take the latest AI models, rigorously analyze them, develop and deploy the high-performance GPU kernels that define model performance, and integrate the derived techniques and methodologies into the tools that automate this process.

This role is a unique opportunity to shape the next generation of AI performance and efficiency. You will work hands-on with emerging AI models, collaborating across compiler, AI inference, and model performance teams. The focus is on building programming solutions that can be applied to concrete AI inference use cases to deliver real-world performance and development efficiency wins.

What You Will Be Doing:

  • Analyze state-of-the-art AI models, identifying key performance bottlenecks and opportunities at the kernel level.
  • Develop, optimize, and evaluate both hand-tuned and compiler-generated kernels for inference workloads, balancing speed and flexibility.
  • Design and build high-level DSLs and innovative compiler infrastructure to increase kernel developer productivity while achieving near-peak performance.
  • Collaborate with model, AI inference, and compiler teams to iterate on kernel fusion, autotuning, and sophisticated GPU programming techniques.
  • Benchmark performance across real workloads, diagnose root causes, and rapidly deploy optimizations that maximize hardware utilization on NVIDIA platforms.

What We Need To See:

  • Bachelor's, master's or PhD degree in Computer Science, Computer Engineering or related field, or equivalent experience.
  • At least 3 years of strong C++ and/or Python programming experience in system and performance engineering.
  • Understanding of GPU architecture and proficiency in CUDA programming.
  • Intellectual curiosity and an interest in solving exciting problems and delivering practical results in production environments.

Ways To Stand Out From The Crowd:

  • Experience designing, developing and optimizing high-efficiency GPU kernels for modern AI workloads.
  • Experience building compilers, domain-specific languages, or automatic optimization systems.
  • Familiarity with popular compiler, GPU programming, and AI frameworks such as MLIR, LLVM, PyTorch, XLA, Triton, or CUTLASS.
  • Experience with AI/ML inference workloads and model performance analysis.
  • Strong communication skills and ability to collaborate in a cross-team environment.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.
