Talent.com
Senior ML Inference Platform Engineer

AION, Seattle, WA, US
Posted 15 days ago
Contract type
  • Full-time
Job description

About AION

AION is building a next-generation AI cloud platform, transforming high-performance computing (HPC) through its decentralized AI cloud. Purpose-built for bare-metal performance, AION democratizes access to compute power for AI training, fine-tuning, inference, data labeling, and the full-stack AI / ML lifecycle.

Led by high-pedigree founders with previous exits, AION is well-funded by major VCs and has strategic global partnerships. Headquartered in the US with a global presence, the company is building its initial core team across India, London, and Seattle.

Who You Are

You're an ML systems engineer who's passionate about building high-performance inference infrastructure. You don't need to be an expert in everything - this field is evolving too rapidly for that - but you have strong fundamentals and the curiosity to dive deep into optimization challenges. You thrive in early-stage environments where you'll learn cutting-edge techniques while building production systems. You think systematically about performance bottlenecks and are excited to push the boundaries of what's possible in AI infrastructure.

Requirements

Key Responsibilities

  • Build and optimize LLM inference systems working towards 2-4x performance improvements over standard frameworks like vLLM and TensorRT-LLM.
  • Implement modern inference optimizations including KV-cache management, dynamic batching, speculative decoding, compression and quantization strategies.
  • Develop GPU optimization solutions using CUDA, with opportunities to learn advanced techniques like Triton kernel development and CUDA graphs.
  • Design model evaluation and benchmarking systems to assess performance across reasoning, coding, and safety metrics.
  • Research and integrate trending open-source models (DeepSeek R1, Qwen 3, Llama 4, Mistral variants) with optimized configurations.
  • Build performance monitoring and profiling tools for GPU cluster analysis, bottleneck identification, and cost optimization.
  • Create cost-performance optimization strategies that balance throughput, latency, and infrastructure costs.
  • Explore agent orchestration capabilities for multi-step reasoning and tool integration workflows.
  • Collaborate with tech and product teams to identify optimization opportunities and translate them into production improvements.

Skills & Experience

  • High-agency individual looking to own and influence product architecture and company direction.
  • 3+ years of software engineering experience with a focus on performance-critical systems and production deployments.
  • Strong Python expertise and working knowledge of C++ for performance optimization.
  • Working understanding of deep learning fundamentals including transformer architectures, attention mechanisms, and neural network training / inference.
  • Hands-on experience with model serving and deployment techniques.
  • Experience with at least one modern inference framework (vLLM, TensorRT-LLM, SGLang or similar) in a production setting.
  • Hands-on experience with PyTorch including model development, training loops, and basic distributed computing concepts.
  • Understanding of distributed systems concepts including load balancing, auto-scaling, and fault tolerance.
  • Basic GPU programming experience with CUDA or willingness to quickly learn GPU optimization techniques.
  • Strong debugging and performance profiling skills for identifying and resolving system bottlenecks.

Benefits

  • Get in on the ground floor of a mission-driven AI startup revolutionizing compute infrastructure.
  • Work with a high-caliber, globally distributed team backed by major VCs.
  • Competitive compensation and benefits.
  • Fast-paced, flexible work environment with room for ownership and impact.
  • Hybrid model: 3 days in-office, 2 days remote, with flexibility to work remotely for part of the year.
  • If you have any questions about the role, please reach out to the hiring manager on LinkedIn or X.
