Talent.com
AGI Sr Inference Software Development Engineering, AGI Inference
Amazon • Sunnyvale, CA, United States
30+ days ago
Job type
  • Full-time
Job description

The Sensory Inference team at AGI is a group of innovative developers working on groundbreaking multi-modal inference solutions that revolutionize how AI systems perceive and interact with the world. We push the limits of inference performance to provide the best possible experience for our users across a wide range of applications and devices. We are looking for talented, passionate, and dedicated Inference Engineers to join our team and build innovative, mission-critical, high-volume production systems that will shape the future of AI. You will have an enormous opportunity to make an impact on the design, architecture, and implementation of novel technologies used every day, potentially by people you know. This role offers the chance to work in a highly technical domain at the boundary between fundamental AI research and production engineering, on inference-efficiency techniques such as quantization, speculative decoding, and long context.

Key job responsibilities

  • Develop high-performance inference software for a diverse set of neural models, typically in C/C++
  • Design, prototype, and evaluate new inference engines and optimization techniques
  • Participate in deep-dive analysis and profiling of production code
  • Optimize inference performance across various platforms (on-device, cloud-based CPU, GPU, proprietary ASICs)
  • Collaborate closely with research scientists to bring next-generation neural models to life
  • Partner with internal and external hardware teams to maximize platform utilization
  • Work in an Agile environment to deliver high-quality software against tight schedules
  • Hold a high bar for technical excellence within the team and across the organization

BASIC QUALIFICATIONS

  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience with at least one software programming language
  • 5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • Experience as a mentor, tech lead, or leader of an engineering team
  • Bachelor's degree in Computer Science, Computer Engineering, or related field
  • Strong C/C++ programming skills
  • Solid understanding of deep learning architectures (CNNs, RNNs, Transformers, etc.)

PREFERRED QUALIFICATIONS

  • 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Experience with inference frameworks such as PyTorch, TensorFlow, ONNX Runtime, TensorRT, llama.cpp, etc.
  • Proficiency in performance optimization for CPU, GPU, or AI hardware
  • Proficiency in kernel programming for accelerated hardware using programming models such as (but not limited to) CUDA, OpenMP, OpenCL, Vulkan, and Metal
  • Experience with latency-sensitive optimizations and real-time inference
  • Understanding of resource constraints on mobile/edge hardware
  • Knowledge of model compression techniques (quantization, pruning, distillation, etc.)
  • Experience with LLM efficiency techniques like speculative decoding and long context
  • Strong communication skills and ability to work in a collaborative environment
  • Passion for solving complex problems and driving innovation in AI technology

    Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

    Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company's reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

    Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.

    Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $151,300/year in our lowest geographic market up to $261,500/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit This position will remain posted until filled. Applicants should apply via our internal or external career site.
