Talent.com
Senior AI Infrastructure Engineer, Cloud Partnerships - DGX Cloud

NVIDIA, Santa Clara, CA, United States
1 day ago
Job type
  • Full-time
Job description

Locations: US, CA, Santa Clara; US, Remote
Time type: Full time
Posted on: Yesterday
Job requisition ID: JR2006692

We are seeking an AI Infrastructure Engineer to integrate third-party infrastructure partners into NVIDIA's operational excellence programs. This cross-functional role is dedicated to developing foundational frameworks for the proactive management of availability across our diverse infrastructure through API integrations and process alignment. The ideal candidate has experience delivering production infrastructure across multiple cloud providers, including hands-on experience building and managing that infrastructure, as well as managing vendor relationships. You will partner with engineering, SRE, product, and third-party infrastructure providers to achieve operational excellence.

What you'll be doing:
  • Architect unified systems for integrating infrastructure provider maintenance events into NVIDIA engineering systems
  • Drive the adoption of operational excellence best practices across all infrastructure providers, partnering with SRE, infra, product, and security teams
  • Define and operationalize governance models for engineering support engagements, infrastructure maintenance lifecycles, and incident escalation paths
  • Measure provider availability against projected maintenance schedules using Service Level Objectives (SLOs)
  • Collaborate with AI/ML teams to integrate intelligent automation into maintenance workflows, such as projecting job capacity impact based on scheduled resource availability and suggesting infrastructure reallocations for high-profile initiatives
  • Develop a long-term roadmap to guide infrastructure providers in progressively adopting best practices for reliability and production hygiene across existing and new product introductions
What we need to see:
  • 8+ years of experience in infrastructure architecture, cloud native, or large-scale platform/reliability roles
  • Bachelor's degree or equivalent experience
  • Experience designing scalable, maintainable backend systems and writing clear design documentation
  • Strong understanding of multiple cloud infrastructure provider resource offerings
  • Demonstrated experience in normalizing and unifying diverse data sources from a variety of systems into broadly applicable schemas, enabling efficient querying and analysis
  • Proven ability to lead and influence cross-functional technical initiatives at scale across vendors and external partners, especially in reliability or platform domains
  • Demonstrated ability to design and implement maintainable APIs for internal and external customers
  • Proficiency in Kubernetes administration, modern CI/CD techniques, and Infrastructure as Code (IaC)
  • Experience building resilient production systems using Golang, Python or Ruby
Ways to stand out from the crowd:
  • Proven experience in operating or architecting production systems across multiple cloud and bare-metal infrastructure providers.
  • Proficiency in data science tools such as Spark, Delta Lake, and Databricks.
  • Hands-on production experience with workflow engines like Temporal, Argo, or similar platforms for durable workflow execution.

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you are creative and autonomous, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until October 27, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

