Principal Deep Learning Software Engineer, LLM Performance

NVIDIA Corporation, Santa Clara, CA, United States

20 hours ago

Job type

Full-time

Job description

We are now looking for a Principal Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate about analyzing and improving the performance of LLM inference. NVIDIA is rapidly growing its research and development for Deep Learning Inference and is seeking excellent Software Engineers at all levels of expertise to join our team. Companies around the world are using NVIDIA GPUs to power a revolution in deep learning, enabling breakthroughs in areas like LLMs, Generative AI, Recommenders and Vision that have put DL into every software solution. Join the team that builds the software to enable the performance optimization, deployment and serving of these DL solutions. We specialize in developing GPU-accelerated deep learning software like TensorRT, DL benchmarking software and performant solutions to deploy and serve these models.

Collaborate with the deep learning community to implement the latest algorithms for public release in TensorRT LLM, vLLM, SGLang and LLM benchmarks. Identify performance opportunities and optimize SoTA LLMs across the spectrum of NVIDIA accelerators, from datacenter GPUs to edge SoCs. Implement LLM inference, serving and deployment algorithms and optimizations using TensorRT LLM, vLLM, SGLang, Triton and CUDA kernels. Collaborate with a diverse set of teams working on performance modeling, performance analysis, kernel development and inference software development.

What you'll be doing:
Performance optimization, analysis, and tuning of LLM, VLM and GenAI models for DL inference, serving and deployment in NVIDIA / OSS LLM frameworks.
Scale performance of LLM models across different architectures and types of NVIDIA accelerators.
Scale performance for maximum throughput, minimum latency, and throughput under latency constraints.
Contribute features and code to NVIDIA / OSS LLM frameworks, inference benchmarking frameworks, TensorRT, and Triton.
Collaborate with teams across generative AI, automotive, image understanding, and speech understanding to develop innovative solutions.
What we need to see:
Bachelor's, Master's, PhD, or equivalent experience in relevant fields (Computer Engineering, Computer Science, EECS, AI).
At least 12 years of relevant software development experience.
Excellent Python / C / C++ programming, software design, and software engineering skills.
Experience with a DL framework like PyTorch, JAX, TensorFlow.
Ways to stand out from the crowd:
Prior experience with an LLM framework or a DL compiler in inference, deployment, algorithms, or implementation.
Prior experience with performance modeling, profiling, debugging, and code optimization of a DL / HPC / high-performance application.
Architectural knowledge of CPU and GPU.
GPU programming experience (CUDA or OpenCL).

GPU deep learning has provided the foundation for machines to learn, perceive, reason and solve problems posed using human language. The GPU started out as the engine for simulating human imagination, conjuring up the amazing virtual worlds of video games and Hollywood films. Now, NVIDIA's GPU runs deep learning algorithms, simulating human intelligence, and acts as the brain of computers, robots and self-driving cars that can perceive and understand the world. Just as human imagination and intelligence are linked, computer graphics and artificial intelligence come together in our architecture. Two modes of the human brain, two modes of the GPU. This may explain why NVIDIA GPUs are used broadly for deep learning, and NVIDIA is increasingly known as “the AI computing company.” Come, join our DL Architecture team, where you can help build the real-time, cost-effective computing platform driving our success in this exciting and quickly growing field.

#LI-Hybrid

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 425,500 USD. You will also be eligible for equity and .

Applications for this job will be accepted at least until July 29, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

