Talent.com
AI System Solution Architect

Cango Inc. • San Francisco, California, United States
5 days ago
Job type
  • Full-time
Job description

Responsibilities

Design end-to-end technical architecture for LLM and Diffusion model inference on large-scale GPU clusters.

Develop innovative solutions in KV Cache management, distributed scheduling, pipelining / batching strategies, memory allocation, and P2P / IB communication.

Architect a multi-tenant serving framework that balances throughput, latency, and cost.

Define product positioning and differentiation based on industry trends and company strategy.

Develop technical evolution plans (e.g., token streaming like vLLM, syntax parsing like SGLang, Diffusion acceleration).

Align closely with internal GPU infrastructure and business teams to ensure timely product delivery.

Lead performance engineering efforts, including NCCL tuning, NUMA binding, and CUDA kernel optimization.

Drive cross-team collaboration (GPU kernel, compiler, distributed system, frontend APIs) to ensure system stability and scalability.

Organize benchmarking and performance testing against industry leaders (vLLM, SGLang, TensorRT, etc.).

Guide the engineering team on implementation strategies, experimental methodologies, and optimization pathways.

Engage with open-source communities and contribute core components to enhance technical influence.

Communicate directly with North America-based clients to understand their needs for AI inference, training, and deployment.

Translate customer needs into internal implementation plans and coordinate across operations, engineering, and delivery teams.

Qualifications

5+ years of experience in computer infrastructure, GPU cloud, or large-scale cloud computing in the U.S., with a deep understanding of the North American tech ecosystem.

Master’s or Ph.D. in Computer Science, Electrical Engineering, or related fields preferred.

5+ years of hands‑on experience in deep learning systems or GPU optimization, including leading the design of at least one large‑scale AI inference or training system.

Proficiency with PyTorch, CUDA, NCCL, Triton, TensorRT, MPI / IB / RDMA, etc.

Deep understanding of projects like vLLM, SGLang, DeepSpeed, FasterTransformer.

Core Competencies

Practical experience in LLM inference optimization (e.g., KV Cache, P2P vs CPU routing, batching strategies).

Ability to integrate system‑level optimization with product usability (API and serving layers).

Strong architectural thinking and cross‑functional communication skills to translate complexity into clear product roadmaps.

Preferred

Open‑source contributions (e.g., to vLLM, DeepSpeed, Ray, Triton‑Server, or SGLang).

Experience launching GPU cloud or AI infrastructure products (e.g., RunPod, Lambda, Modal, SageMaker).

Familiarity with emerging LLM inference trends such as speculative decoding, continuous batching, and streaming inference.

What We Offer

Hands‑on opportunity to manage and optimize GPU clusters at multi‑thousand‑card scale, operating at the forefront of global compute infrastructure.

Strategic partner role in both product architecture and business decisions alongside the core leadership team.

Key role in building the next‑generation GPU‑based AI inference infrastructure.

High degree of autonomy in product and architectural decisions.

Competitive compensation package with equity incentives.

Global team and access to cross‑regional GPU cluster resources.

Job Details

Seniority Level: Mid‑Senior Level

Employment Type: Full‑time

Job Function: Information Technology

Industries: Technology, Information and Internet

