Talent.com
No longer accepting applications
Solutions Architect, Inference Deployments

NVIDIA, Santa Clara, CA, United States
5 days ago
Job type
  • Full-time
Job description

We’re forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA’s GPU technology and Kubernetes. As a Solutions Architect (Inference Focus), you’ll collaborate closely with our engineering, DevOps, and customer success teams to foster enterprise AI adoption. Together, we’ll bring generative AI to production!

What you'll be doing:

Help customers craft, deploy, and maintain scalable, GPU-accelerated inference pipelines on Kubernetes for large language models (LLMs) and generative AI workloads.

Tune performance using TensorRT / TensorRT-LLM, NVIDIA NIM, and Triton Inference Server to improve GPU utilization and model efficiency.

Collaborate with cross-functional teams (engineering, product) and offer technical mentorship to customers implementing AI at scale.

Architect zero-downtime deployments, autoscaling (e.g., HPA or an equivalent with custom metrics), and integration with cloud-native tools (e.g., OpenTelemetry, Prometheus, Grafana).

What we need to see:

5+ years in Solutions Architecture with a proven track record of moving AI inference from POC to production on Kubernetes.

Experience architecting GPU allocation using NVIDIA GPU Operator and NVIDIA NIM Operator, including the ability to troubleshoot sophisticated GPU orchestration, optimize with Multi-Instance GPU (MIG), and ensure efficient utilization in Kubernetes environments.

Proficiency with TensorRT-LLM, Triton, and TensorRT for model optimization and serving.

Success stories optimizing LLMs for low-latency inference in enterprise environments.

BS or equivalent experience in CS / Engineering.

Ways to stand out from the crowd:

Prior experience deploying NVIDIA NIM microservices for multi-model inference.

Knowledge of serverless inference and FaaS patterns (e.g., Google Cloud Run, AWS Lambda, NVCF) with NVIDIA GPUs.

NVIDIA Certified AI Engineer or similar.

Active contributions to Kubernetes SIGs or AI inference projects (e.g., KServe, Dynamo, SGLang or similar).

Familiarity with networking concepts that support multi-node inference, such as MPI, LWS, or similar.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until August 30, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
