
No longer accepting applications

AI Performance Engineer

Cornelis Networks • San Francisco, CA, United States

17 days ago

Job type

Full-time

Job description

Cornelis Networks delivers the world's highest-performance scale-out networking solutions for AI and HPC datacenters. Our differentiated architecture seamlessly integrates hardware, software, and system-level technologies to maximize the efficiency of GPU, CPU, and accelerator-based compute clusters at any scale. Our solutions drive breakthroughs in AI and HPC workloads, empowering our customers to push the boundaries of innovation. Backed by top-tier venture capital and strategic investors, we are committed to innovation, performance, and scalability, solving the world's most demanding computational challenges with our next-generation networking solutions.

We are a fast-growing, forward-thinking team of architects, engineers, and business professionals with a proven track record of building successful products and companies. As a global organization, our team spans multiple U.S. states and six countries, and we continue to expand with exceptional talent in onsite, hybrid, and fully remote roles.

We're seeking an AI Performance Engineer who will optimize training and multi-node inference across next-gen networking silicon and systems: adapters, switches, and the software stack that ties it all together. You'll partner with architecture, firmware, software, and lighthouse customers to turn lab results into field-proven wins, with an emphasis on distributed serving architectures and P99-aware optimizations.

Key Responsibilities:

Own end-to-end performance for distributed AI workloads (training + multi-node inference) across multi-node clusters and diverse fabrics (Omni-Path, Ethernet, InfiniBand).
Benchmark, characterize, and tune open-source and industry workloads (e.g., Llama, Mixtral, diffusion, BERT/T5, MLPerf) on current and future compute, storage, and network hardware, including vLLM/TensorRT-LLM/Triton serving paths.
Design and optimize distributed serving topologies (sharded/replicated, tensor/pipeline parallel, MoE expert placement), continuous/adaptive batching, KV-cache sharding/offload (CPU/NVMe) and prefix caching, and token streaming with tight p99/p999 SLOs.
Optimize inference: validate RDMA/GPUDirect RDMA, congestion control, and collective/point-to-point tradeoffs.
Design experiment plans to isolate scaling bottlenecks (collectives, kernel hot spots, I/O, memory, topology) and deliver clear, actionable deltas with latency-SLO dashboards and queuing analysis.
Build crisp proof points that compare Cornelis Omni-Path to competing interconnects; translate data into narratives for sales/marketing and lighthouse customers, including cost per token and tokens/sec per watt for serving.
Instrument and visualize performance (Nsight Systems, ROCm/Omnitrace, VTune, perf, eBPF, RCCL/NCCL tracing, app timers) plus serving telemetry (Prometheus/Grafana, OpenTelemetry traces, concurrency/queue depth).
Evangelize best practices through briefs, READMEs, and conference-level presentations on distributed inference patterns and anti-patterns.

Minimum Qualifications:

B.S. in CS/EE/CE/Math or a related field.

5-7+ years running AI / ML at cluster scale.

Proven ability to set up, run, and analyze AI benchmarks; deep intuition for message passing, collectives, scaling efficiency, and bottleneck hunting for both training and low-latency serving.

Hands-on with distributed training beyond single-GPU (DP/TP/PP, ZeRO, FSDP, sharded optimizers) and distributed inference architectures (replicated vs. sharded, tensor/KV parallel, MoE).

Practical experience across AI stacks and comms: PyTorch, DeepSpeed, Megatron-LM, PyTorch Lightning; RCCL/NCCL, MPI/Horovod; Triton Inference Server, vLLM, TensorRT-LLM, Ray Serve, KServe.

Comfortable with compilers (GCC, LLVM, Intel oneAPI) and MPI stacks; Python + shell power user.

Familiarity with network architectures (Omni-Path/OPA, InfiniBand, Ethernet/RDMA/RoCE) and Linux systems at the performance-tuning level, including NIC offloads, CQ moderation, pacing, ECN/RED.

Excellent written and verbal communication: turn measurements into persuasion with SLO-driven narratives for inference.

Preferred Qualifications:

M.S. in CS/EE/CE/Math or a related field.

Scheduler expertise (SLURM, PBS) and multi-tenant cluster ops.

Hands-on profiling and tracing of GPU/comm paths (Nsight Systems, Nsight Compute, ROCm tools/rocprof/roctracer/omnitrace, VTune, perf, PCP, eBPF).

Experience with NeMo, DeepSpeed, Megatron-LM, FSDP, and collective ops analysis (AllReduce/AllGather/ReduceScatter/Broadcast).

Background in HPC performance engineering or storage (BeeGFS, Lustre, NVMe-oF) for data and checkpoint pipelines.

Location: This is a remote position for employees residing within the United States.

We offer a competitive compensation package that includes equity, cash, and incentives, along with health and retirement benefits. Our dynamic, flexible work environment provides the opportunity to collaborate with some of the most influential names in the semiconductor industry.

At Cornelis Networks, your base salary is only one component of your comprehensive total rewards package. Your base pay will be determined by factors such as your skills, qualifications, experience, and location relative to the hiring range for the position. Depending on your role, you may also be eligible for performance-based incentives, including an annual bonus or sales incentives.

In addition to your base pay, you'll have access to a broad range of benefits, including medical, dental, and vision coverage, as well as disability and life insurance, a dependent care flexible spending account, accidental injury insurance, and pet insurance. We also offer generous paid holidays, 401(k) with company match, and Open Time Off (OTO) for regular full-time exempt employees. Other paid time off benefits include sick time, bonding leave, and pregnancy disability leave.

Cornelis Networks does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. Cornelis Networks is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, disability status, genetic information, protected veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
