Talent.com
AI Performance Engineer
No longer accepting applications
Cornelis Networks • San Francisco, CA, United States
17 days ago
Job type
  • Full-time
Job description

Cornelis Networks delivers the world's highest-performance scale-out networking solutions for AI and HPC datacenters. Our differentiated architecture seamlessly integrates hardware, software, and system-level technologies to maximize the efficiency of GPU, CPU, and accelerator-based compute clusters at any scale. Our solutions drive breakthroughs in AI & HPC workloads, empowering our customers to push the boundaries of innovation. Backed by top-tier venture capital and strategic investors, we are committed to innovation, performance, and scalability, solving the world's most demanding computational challenges with our next-generation networking solutions.

We are a fast-growing, forward-thinking team of architects, engineers, and business professionals with a proven track record of building successful products and companies. As a global organization, our team spans multiple U.S. states and six countries, and we continue to expand with exceptional talent in onsite, hybrid, and fully remote roles.

We're seeking an AI Performance Engineer who will optimize training and multi-node inference across next-gen networking silicon and systems (adapters, switches, and the software stack that ties it all together). You'll partner with architecture, firmware, software, and lighthouse customers to turn lab results into field-proven wins, with an emphasis on distributed serving architectures and P99-aware optimizations.

Key Responsibilities:

  • Own end-to-end performance for distributed AI workloads (training + multi-node inference) across multi-node clusters and diverse fabrics (Omni-Path, Ethernet, InfiniBand).
  • Benchmark, characterize, and tune open-source & industry workloads (e.g., Llama, Mixtral, diffusion, BERT/T5, MLPerf) on current and future compute, storage, and network hardware, including vLLM/TensorRT-LLM/Triton serving paths.
  • Design and optimize distributed serving topologies (sharded/replicated, tensor/pipeline parallel, MoE expert placement), continuous/adaptive batching, KV-cache sharding/offload (CPU/NVMe) and prefix caching, and token streaming with tight p99/p999 SLOs.
  • Optimize inference: validate RDMA/GPUDirect RDMA, congestion control, and collective/point-to-point tradeoffs during inference.
  • Design experiment plans to isolate scaling bottlenecks (collectives, kernel hot spots, I/O, memory, topology) and deliver clear, actionable deltas with latency-SLO dashboards and queuing analysis.
  • Build crisp proof points that compare Cornelis Omni-Path to competing interconnects; translate data into narratives for sales/marketing and lighthouse customers, including cost-per-token and tokens/sec-per-watt for serving.
  • Instrument and visualize performance (Nsight Systems, ROCm/Omnitrace, VTune, perf, eBPF, RCCL/NCCL tracing, app timers) plus serving telemetry (Prometheus/Grafana, OpenTelemetry traces, concurrency/queue depth).
  • Evangelize best practices through briefs, READMEs, and conference-level presentations on distributed inference patterns and anti-patterns.

Minimum Qualifications:

  • B.S. in CS/EE/CE/Math or related field.
  • 5-7+ years running AI/ML at cluster scale.
  • Proven ability to set up, run, and analyze AI benchmarks; deep intuition for message passing, collectives, scaling efficiency, and bottleneck hunting for both training and low-latency serving.
  • Hands-on with distributed training beyond single-GPU (DP/TP/PP, ZeRO, FSDP, sharded optimizers) and distributed inference architectures (replicated vs. sharded, tensor/KV parallel, MoE).
  • Practical experience across AI stacks & comms: PyTorch, DeepSpeed, Megatron-LM, PyTorch Lightning; RCCL/NCCL, MPI/Horovod; Triton Inference Server, vLLM, TensorRT-LLM, Ray Serve, KServe.
  • Comfortable with compilers (GCC/LLVM/Intel/oneAPI) and MPI stacks; Python + shell power user.
  • Familiarity with network architectures (Omni-Path/OPA, InfiniBand, Ethernet/RDMA/RoCE) and Linux systems at the performance-tuning level, including NIC offloads, CQ moderation, pacing, ECN/RED.
  • Excellent written and verbal communication: turn measurements into persuasion with SLO-driven narratives for inference.

Preferred Qualifications:

  • M.S. in CS/EE/CE/Math or related field.
  • Scheduler expertise (SLURM, PBS) and multi-tenant cluster ops.
  • Hands-on profiling & tracing of GPU/comm paths (Nsight Systems, Nsight Compute, ROCm tools/rocprof/roctracer/omnitrace, VTune, perf, PCP, eBPF).
  • Experience with NeMo, DeepSpeed, Megatron-LM, FSDP, and collective ops analysis (AllReduce/AllGather/ReduceScatter/Broadcast).
  • Background in HPC performance engineering or storage (BeeGFS, Lustre, NVMe-oF) for data & checkpoint pipelines.

Location: This is a remote position for employees residing within the United States.

We offer a competitive compensation package that includes equity, cash, and incentives, along with health and retirement benefits. Our dynamic, flexible work environment provides the opportunity to collaborate with some of the most influential names in the semiconductor industry.

At Cornelis Networks your base salary is only one component of your comprehensive total rewards package. Your base pay will be determined by factors such as your skills, qualifications, experience, and location relative to the hiring range for the position. Depending on your role, you may also be eligible for performance-based incentives, including an annual bonus or sales incentives.

In addition to your base pay, you'll have access to a broad range of benefits, including medical, dental, and vision coverage, as well as disability and life insurance, a dependent care flexible spending account, accidental injury insurance, and pet insurance. We also offer generous paid holidays, 401(k) with company match, and Open Time Off (OTO) for regular full-time exempt employees. Other paid time off benefits include sick time, bonding leave, and pregnancy disability leave.

Cornelis Networks does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. Cornelis Networks is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, disability status, genetic information, protected veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
