We are looking for a Machine Learning Runtime Optimization Engineer to work on an innovative project that redefines software. In this role, you will focus on backend optimizations for ML runtimes, including hardware acceleration and inference speed improvements.
Experience with ML inference engines (ONNX Runtime, TensorRT, Core ML, etc.) and with optimizing models for deployment.
Proficiency in macOS- and Linux-based runtimes and experience with heterogeneous compute environments (CPUs, GPUs, and NPUs).
Deep understanding of numerical optimization, compiler techniques, and low-level performance tuning.
Open to new graduates with a PhD in optimization, systems, machine learning, or related fields.
Machine Learning Engineer • Remote, United States