Talent.com
AI System Solution Architect

Cango Inc. • San Francisco, California, United States
8 days ago
Job type
  • Full-time
Job description

Responsibilities

Design end-to-end technical architecture for LLM and Diffusion model inference on large-scale GPU clusters.

Develop innovative solutions in KV Cache management, distributed scheduling, pipelining / batching strategies, memory allocation, and P2P / IB communication.

Architect a multi-tenant serving framework that balances throughput, latency, and cost.

Define product positioning and differentiation based on industry trends and company strategy.

Develop technical evolution plans (e.g., token streaming like vLLM, syntax parsing like SGLang, Diffusion acceleration).

Align closely with internal GPU infrastructure and business teams to ensure timely product delivery.

Lead performance engineering efforts, including NCCL tuning, NUMA binding, and CUDA kernel optimization.

Drive cross-team collaboration (GPU kernel, compiler, distributed system, frontend APIs) to ensure system stability and scalability.

Organize benchmarking and performance testing against industry leaders (vLLM, SGLang, TensorRT, etc.).

Guide the engineering team on implementation strategies, experimental methodologies, and optimization pathways.

Engage with open-source communities and contribute core components to enhance technical influence.

Communicate directly with North America-based clients to understand their needs for AI inference, training, and deployment.

Translate customer needs into internal implementation plans and coordinate across operations, engineering, and delivery teams.

Qualifications

5+ years of experience in computer infrastructure, GPU cloud, or large-scale cloud computing in the U.S., with a deep understanding of the North American tech ecosystem.

Master’s or Ph.D. in Computer Science, Electrical Engineering, or related fields preferred.

5+ years of hands‑on experience in deep learning systems or GPU optimization, including leading the design of at least one large‑scale AI inference or training system.

Proficiency with PyTorch, CUDA, NCCL, Triton, TensorRT, MPI / IB / RDMA, etc.

Deep understanding of projects like vLLM, SGLang, DeepSpeed, FasterTransformer.

Core Competencies

Practical experience in LLM inference optimization (e.g., KV Cache, P2P vs CPU routing, batching strategies).

Ability to integrate system‑level optimization with product usability (API and serving layers).

Strong architectural thinking and cross‑functional communication skills to translate complexity into clear product roadmaps.

Preferred

Open‑source contributions (e.g., to vLLM, DeepSpeed, Ray, Triton‑Server, or SGLang).

Experience launching GPU cloud or AI infrastructure products (e.g., RunPod, Lambda, Modal, SageMaker).

Familiarity with emerging LLM inference trends such as speculative decoding, continuous batching, and streaming inference.

What We Offer

Hands‑on opportunity to manage and optimize GPU clusters at multi‑thousand‑card scale, operating at the forefront of global compute infrastructure.

Strategic partner role in both product architecture and business decisions alongside the core leadership team.

Key role in building the next‑generation GPU‑based AI inference infrastructure.

High degree of autonomy in product and architectural decisions.

Competitive compensation package with equity incentives.

Global team and access to cross‑regional GPU cluster resources.

Job Details

Seniority Level: Mid‑Senior Level

Employment Type: Full‑time

Job Function: Information Technology

Industries: Technology, Information and Internet
