Talent.com
Software Engineer - Model Performance
AI Fund • San Francisco, CA, United States
30+ days ago
Job type
  • Full-time
Job description

Overview

Are you passionate about advancing the application of artificial intelligence? We are looking for a Software Engineer focused on ML performance to join our dynamic team. This role is ideal for someone who thrives in a fast-paced startup environment and is eager to make significant contributions to the exciting field of LLM inference. If you are a backend engineer who loves making things faster and is excited about open-source ML models, we look forward to your application.

About Baseten

Baseten provides the infrastructure, tooling, and expertise needed to bring great AI products to market - fast. Backed by top investors including

Responsibilities

  • Implement, refine, and productionize cutting-edge techniques (quantization, speculative decoding, KV cache reuse, chunked prefill, and LoRA) for ML model inference and infrastructure.
  • Deep dive into the underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues.
  • Apply and scale optimization techniques across a wide range of ML models, particularly large language models.
  • Collaborate with a diverse team to design and implement innovative solutions.
  • Own projects from idea to production.

Qualifications

  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or related field.
  • Experience with one or more general-purpose programming languages, such as Python or C++.
  • Familiarity with LLM optimization techniques (e.g., quantization, speculative decoding, continuous batching).
  • Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.
  • Demonstrated interest and experience in LLMs.
  • Deep understanding of GPU architecture.

Bonus

  • Proficiency in enhancing the performance of software systems, particularly in the context of large language models (LLMs).
  • Experience with CUDA or similar technologies.
  • Deep understanding of software engineering principles and a proven track record of developing and deploying AI / ML inference solutions.
  • Experience with Docker and Kubernetes.

Benefits

  • Competitive compensation package.
  • This is a unique opportunity to be part of a rapidly growing startup in one of the most exciting engineering fields of our era.
  • An inclusive and supportive work culture that fosters learning and growth.
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.

Job Details

  • Seniority level: Entry level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Venture Capital and Private Equity Principals
  • Location: San Francisco, CA

Salary: $160,000.00-$180,000.00 per year
