Talent.com
AI System Solution Architect

Cango Inc.
San Francisco, California, United States
13 hours ago
Job type
  • Full-time
Job description

Join Cango Inc. as a Senior Solutions Architect focusing on LLM and diffusion model inference on large-scale GPU clusters.

Responsibilities

Design end-to-end technical architecture for LLM and Diffusion model inference on large-scale GPU clusters.

Develop innovative solutions in KV Cache management, distributed scheduling, pipelining / batching strategies, memory allocation, and P2P / IB communication.

Architect a multi-tenant serving framework that balances throughput, latency, and cost.

Define product positioning and differentiation based on industry trends and company strategy.

Develop technical evolution plans (e.g., token streaming like vLLM, syntax parsing like SGLang, Diffusion acceleration).

Align closely with internal GPU infrastructure and business teams to ensure timely product delivery.

Lead performance engineering efforts, including NCCL tuning, NUMA binding, and CUDA kernel optimization.

Drive cross-team collaboration (GPU kernel, compiler, distributed system, frontend APIs) to ensure system stability and scalability.

Organize benchmarking and performance testing against industry leaders (vLLM, SGLang, TensorRT, etc.).

Guide the engineering team on implementation strategies, experimental methodologies, and optimization pathways.

Engage with open-source communities and contribute core components to enhance technical influence.

Communicate directly with North America-based clients to understand their needs for AI inference, training, and deployment.

Translate customer needs into internal implementation plans and coordinate across operations, engineering, and delivery teams.

Qualifications

5+ years of experience in computer infrastructure, GPU cloud, or large-scale cloud computing in the U.S., with a deep understanding of the North American tech ecosystem.

Master’s or Ph.D. in Computer Science, Electrical Engineering, or related fields preferred.

5+ years of hands‑on experience in deep learning systems or GPU optimization, including leading the design of at least one large‑scale AI inference or training system.

Proficiency with PyTorch, CUDA, NCCL, Triton, TensorRT, MPI / IB / RDMA, etc.

Deep understanding of projects like vLLM, SGLang, DeepSpeed, FasterTransformer.

Practical experience in LLM inference optimization (e.g., KV Cache, P2P vs CPU routing, batching strategies).

Ability to integrate system‑level optimization with product usability (API and Serving layers).

Strong architectural thinking and cross‑functional communication skills to translate complexity into clear product roadmaps.

Preferred

Open‑source contributions (e.g., to vLLM, DeepSpeed, Ray, Triton‑Server, or SGLang).

Experience launching GPU cloud or AI infrastructure products (e.g., RunPod, Lambda, Modal, SageMaker).

Familiarity with emerging LLM inference trends such as speculative decoding, continuous batching, and streaming inference.

What We Offer

Hands‑on opportunity to manage and optimize GPU clusters at multi‑thousand‑card scale, operating at the forefront of global compute infrastructure.

Strategic partner role in both product architecture and business decisions alongside the core leadership team.

Key role in building the next‑generation GPU‑based AI inference infrastructure.

High degree of autonomy in product and architectural decisions.

Competitive compensation package with equity incentives.

Global team and access to cross‑regional GPU cluster resources.
