Talent.com
AI System Solution Architect
Cango Inc. • San Francisco, CA, United States
14 hours ago
Job type
  • Full-time
Job description

Join Cango Inc. as a Senior Solutions Architect focusing on LLM and diffusion model inference on large-scale GPU clusters.

Responsibilities

  • Design end-to-end technical architecture for LLM and Diffusion model inference on large-scale GPU clusters.
  • Develop innovative solutions in KV Cache management, distributed scheduling, pipelining / batching strategies, memory allocation, and P2P / IB communication.
  • Architect a multi-tenant serving framework that balances throughput, latency, and cost.
  • Define product positioning and differentiation based on industry trends and company strategy.
  • Develop technical evolution plans (e.g., token streaming like vLLM, syntax parsing like SGLang, Diffusion acceleration).
  • Align closely with internal GPU infrastructure and business teams to ensure timely product delivery.
  • Lead performance engineering efforts, including NCCL tuning, NUMA binding, and CUDA kernel optimization.
  • Drive cross-team collaboration (GPU kernel, compiler, distributed system, frontend APIs) to ensure system stability and scalability.
  • Organize benchmarking and performance testing against industry leaders (vLLM, SGLang, TensorRT, etc.).
  • Guide the engineering team on implementation strategies, experimental methodologies, and optimization pathways.
  • Engage with open-source communities and contribute core components to enhance technical influence.
  • Communicate directly with North America-based clients to understand their needs for AI inference, training, and deployment.
  • Translate customer needs into internal implementation plans and coordinate across operations, engineering, and delivery teams.

Qualifications

  • 5+ years of experience in computer infrastructure, GPU cloud, or large-scale cloud computing in the U.S., with a deep understanding of the North American tech ecosystem.
  • Master’s or Ph.D. in Computer Science, Electrical Engineering, or related fields preferred.
  • 5+ years of hands‑on experience in deep learning systems or GPU optimization, including leading the design of at least one large‑scale AI inference or training system.
  • Proficiency with PyTorch, CUDA, NCCL, Triton, TensorRT, MPI / IB / RDMA, etc.
  • Deep understanding of projects like vLLM, SGLang, DeepSpeed, FasterTransformer.
  • Practical experience in LLM inference optimization (e.g., KV Cache, P2P vs CPU routing, batching strategies).
  • Ability to integrate system‑level optimization with product usability (API and Serving layers).
  • Strong architectural thinking and cross‑functional communication skills to translate complexity into clear product roadmaps.

Preferred

  • Open‑source contributions (e.g., to vLLM, DeepSpeed, Ray, Triton‑Server, SGLang, etc.).
  • Experience launching GPU cloud or AI infrastructure products (e.g., RunPod, Lambda, Modal, SageMaker).
  • Familiarity with emerging LLM inference trends such as speculative decoding, continuous batching, and streaming inference.

What We Offer

  • Hands‑on opportunity to manage and optimize GPU clusters at multi‑thousand‑card scale, operating at the forefront of global compute infrastructure.
  • Strategic partner role in both product architecture and business decisions alongside the core leadership team.
  • Key role in building the next‑generation GPU‑based AI inference infrastructure.
  • High degree of autonomy in product and architectural decisions.
  • Competitive compensation package with equity incentives.
  • Global team and access to cross‑regional GPU cluster resources.