Talent.com
Solutions Architect, Inference Deployments

NVIDIA, Santa Clara, CA, United States
30+ days ago
Job type
  • Full-time
Job description

We’re forming a team of innovators to roll out and enhance AI inference solutions at scale, showcasing NVIDIA’s GPU technology and Kubernetes. As a Solutions Architect (Inference Focus), you’ll collaborate closely with our engineering, DevOps, and customer success teams to drive enterprise AI adoption. Together, we'll bring generative AI to production!

What you'll be doing:

Help customers craft, deploy, and maintain scalable, GPU-accelerated inference pipelines on Kubernetes for large language models (LLMs) and generative AI workloads.

Tune performance using TensorRT/TensorRT-LLM, NVIDIA NIM, and Triton Inference Server to improve GPU utilization and model efficiency.

Collaborate with cross-functional teams (engineering, product) and offer technical mentorship to customers implementing AI at scale.

Architect zero-downtime deployments, autoscaling (e.g., HPA or an equivalent with custom metrics), and integration with cloud-native tools (e.g., OpenTelemetry, Prometheus, Grafana).

What we need to see:

5+ years in solutions architecture, with a proven track record of moving AI inference from proof of concept (POC) to production on Kubernetes.

Experience architecting GPU allocation using NVIDIA GPU Operator and NVIDIA NIM Operator, including troubleshooting complex GPU orchestration, optimizing with Multi-Instance GPU (MIG), and ensuring efficient utilization in Kubernetes environments.

Proficiency with TensorRT-LLM, Triton, and TensorRT for model optimization and serving.

Success stories optimizing LLMs for low-latency inference in enterprise environments.

BS in CS/Engineering or equivalent experience.

Ways to stand out from the crowd:

Prior experience deploying NVIDIA NIM microservices for multi-model inference.

Knowledge of serverless inference and FaaS patterns (e.g., Google Cloud Run, AWS Lambda, NVCF) with NVIDIA GPUs.

NVIDIA Certified AI Engineer or similar.

Active contributions to Kubernetes SIGs or AI inference projects (e.g., KServe, Dynamo, SGLang or similar).

Familiarity with networking concepts that support multi-node inference, such as MPI, LWS, or similar.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until November 25, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

