Staff AI Engineer, Inference & Optimization

Sonatus, Sunnyvale, California, United States
30+ days ago
Job type
  • Full-time
Job description

Sonatus is a well-funded, fast-paced, and rapidly growing company whose software products and solutions help automakers build dynamic software-defined vehicles. With over four million vehicles already on the road with top global OEM brands, our vehicle and cloud software solutions are at the forefront of automotive digital transformation. The Sonatus team is a talented and diverse collection of technology and automotive specialists hailing from many of the most prominent companies in their respective industries.

The Opportunity:

We're looking for a highly skilled and experienced Staff AI Engineer with domain expertise in optimizing AI models for production Edge environments. You'll own the full lifecycle of model inference and hardware acceleration, from initial optimization to large-scale deployment. In this role, you will be a key contributor to our team, ensuring our AI solutions are not just functional but also incredibly fast, efficient, and reliable on various inference hardware platforms.

Role and Responsibilities:

  • Design, build, and maintain robust pipelines and runtime environments for deploying and serving machine learning models at the Edge. Ensure high availability, low latency, and efficient resource utilization for inference at scale.
  • Collaborate with researchers and hardware engineers to optimize models for performance, latency, and power consumption on specific hardware, including GPUs, TPUs, NPUs, and FPGAs. This includes a strong focus on inference optimization techniques like quantization, pruning, and knowledge distillation.
  • Use AI compilers and specialized software stacks (e.g., TensorRT, OpenVINO, TVM) to accelerate model execution, ensuring models are compiled and optimized for peak performance on target hardware.
  • Design, build, and maintain MLOps pipelines for deploying models to various edge devices (e.g., highly integrated vehicle compute), with a specific focus on performance and efficiency constraints.
  • Implement and maintain monitoring and alerting systems to track model performance, data drift, and overall model health in production.
  • Work with cloud platforms and on-device environments to provision and manage the necessary infrastructure for scalable and reliable model serving.
  • Proactively identify and resolve issues related to model performance, deployment failures, and data discrepancies, with a specific focus on inference bottlenecks.
  • Work closely with Machine Learning Engineers, Software Engineers, and Product Managers to bring models from design to high-performance production systems.

Qualifications:

  • Minimum 7 years of work experience in MLOps or a similar role with a strong focus on high-performance machine learning systems.
  • Proven experience with inference optimization techniques such as quantization (INT8, FP16), pruning, and model distillation.
  • Deep hands-on experience with hardware acceleration for machine learning, including familiarity with GPUs, TPUs, NPUs and related software ecosystems.
  • Strong experience with AI compilers and runtime environments like TensorRT, OpenVINO, and TVM.
  • Proven experience deploying and managing ML models on edge devices (e.g., NVIDIA Jetson, Raspberry Pi, NXP, Renesas).
  • Strong experience in designing and building distributed systems. Proficiency with inter-process communication protocols like gRPC, message queuing systems like MQTT, and efficient data handling techniques such as buffering and callbacks.
  • Hands-on experience with popular ML frameworks such as PyTorch, TensorFlow, TFLite, and ONNX.
  • Proficiency in programming languages, including Python and C++.
  • Solid understanding of machine learning concepts, the ML development lifecycle, and the challenges of deploying models at scale.
  • Proficiency with containerization technologies (Docker, Kubernetes) and cloud platforms (AWS, Azure).
  • Expertise in CI/CD principles and tools applied to machine learning workflows.
  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related quantitative field.

Benefits:

Sonatus is a tight-knit team aligned around a unified vision. You can expect a strong engineering-oriented culture that focuses on building the best products and solutions for our customers. We embrace equality and diversity in all regards because respect is ingrained in our every fiber. Other benefits Sonatus offers include:

  • Stock option plan
  • Health care plan (Medical, Dental & Vision)
  • Retirement plan (401k, IRA)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Unlimited paid time off (Vacation, Sick & Public Holidays)
  • Family leave (Maternity, Paternity)
  • Flexible work arrangements
  • Free food & snacks in office

The posted salary range is a general guideline and represents a good faith estimate of what Sonatus ("Company") could reasonably expect to pay for a base salary for this position. The pay offered to a selected candidate will be determined based on factors such as (but not limited to) the scope and responsibilities of the position, the qualifications of the selected candidate, departmental budget availability, geographic location, and external market pay for comparable jobs. The Company reserves the right to modify this range in the future, as needed, as market conditions change.

Pay range for this role: $197,500 - $260,000 USD

Sonatus is a fast-paced and innovative company and is seeking team members who are passionate about making a difference. If you are ready to take your career to the next level, we highly encourage you to apply.

To all recruitment agencies: Sonatus, Inc. ("Sonatus") does not accept unsolicited agency resumes. Please do not forward resumes to our careers alias or other Sonatus employees. Sonatus is not responsible for any fees associated with unsolicited activities.
