GPGPU Software Architect / Principal Engineer

XPENG • Santa Clara, CA, United States
17 days ago
Job type
  • Full-time
Job description

XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its products, including electric vehicles (EVs), electric vertical take-off and landing (eVTOL) aircraft, and robotics. With a strong focus on intelligent mobility, XPENG is dedicated to reshaping the future of transportation through cutting-edge R&D in AI, machine learning, and smart connectivity.

Our pioneering first-generation NPU, built on a domain-specific architecture (DSA), has successfully entered mass production. We're currently validating the architecture of our second generation and are making the strategic decision to transition to a general-purpose GPU (GPGPU) architecture.

We're completely overhauling our software stack and embracing the CUDA ecosystem. Our goal is to achieve over 90% compatibility with cuBLAS / cuDNN on Linux across PCIe and CXL connections, all while delivering at least 1.3 times the performance of existing solutions on Transformer and Stable-Diffusion workloads.

Job Responsibilities:

Software Technical Strategy

  • Develop and refine a comprehensive 3-year roadmap for a software stack compatible with CUDA, encompassing Runtime, Driver, Compiler, Profiler, Debugger, and AI acceleration libraries
  • Define binding specifications that link our upcoming GPU ISA to CUDA APIs, ensuring forward compatibility with CUDA 12.x features
  • Evaluate and integrate the latest technological advancements: CUDA Graph, Transformer Engine, virtual memory management, CUDA Dynamic Parallelism, CUTLASS 3.x, TMA, Blackwell FP4, among others

Architecture & Design

  • Create a modular, layered Runtime architecture (CUDA → HAL → Kernel → Hardware) that applies across both emulators and actual silicon
  • Define the task launch protocol, including Queue, Stream, Event, and Graph, as well as the memory model
  • Design a dual-mode (JIT & offline) compiler supporting LTO, PGO, Auto-Tuning, and efficient PTX→ISA microcode caching
  • Develop GPU virtualization schemes (MIG) that work across processes and containers

Performance & Observability

  • Implement an end-to-end performance model: Python API → CUDA Runtime → Driver → ISA → Micro-architecture → Board-level interconnect
  • Build an observability platform: Nsys-compatible traces, real-time Metric-QPS dashboards, and an AI Advisor that identifies bottlenecks automatically
  • Manage internal AI benchmarks as the single source of truth. The benchmarks include MLPerf Inference, Stable Diffusion XL, and a 70B LLM

Cross-functional Collaboration

  • Co-design, with our hardware architecture team, an ISA that is compatible with CUDA Compute Capability 12.x
  • Collaborate with AI framework teams (PyTorch, TensorFlow, JAX, ONNX Runtime) to build fully reusable kernel libraries
  • Partner with Cloud and K8s teams to co-develop Device Plugins, GPU Operators, and RDMA Network Policies

Minimum Requirements:

  • 10+ years in systems software, with at least 5 years designing CUDA compute stacks
  • Led end-to-end development of at least one generation of a GPU Runtime or AI acceleration library
  • Comprehensive mastery of PTX / SASS, CUDA Driver API, and cuBLAS / cuDNN internals; experience with LLVM NVPTX backend
  • Profound understanding of GPU micro-architecture, including SM architecture, Warp Scheduler, Shared-Memory conflicts, and Tensor Core pipelines
  • Proficiency with PCIe / CXL / RDMA topologies, NUMA settings, and GPUDirect RDMA / Storage

The base salary range for this full-time position is $241,800 - $409,200 in addition to bonus, equity and benefits. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.

We are an Equal Opportunity Employer. It is our policy to provide equal employment opportunities to all qualified persons without regard to race, age, color, sex, sexual orientation, religion, national origin, disability, veteran status or marital status or any other prescribed category set forth in federal or state regulations.
