Quadric has developed an innovative general-purpose neural processing unit (GPNPU) architecture. Our co-optimized software and hardware are designed to run neural network (NN) inference workloads across a wide range of edge and endpoint devices, from battery-operated smart sensors to high-performance automotive and autonomous vehicle systems. Unlike other NPUs or neural network accelerators, which can accelerate only parts of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.
Our Values :
Integrity, Humility, Happiness
Our Expectations :
Initiative, Collaboration, Completion
Role :
You will join the data science team focused on model optimization. Your work will involve researching, prototyping, and validating low-precision techniques to make neural networks more efficient on the Chimera GPNPU. Your analyses will influence the quantization recipes in the Chimera SDK and guide future hardware features.
Responsibilities :
- Design rigorous experiments to compare PTQ, QAT, pruning, and mixed-precision schemes on vision, language, and multimodal models.
- Build calibration datasets and develop Python notebooks or dashboards to track accuracy, latency, power, and memory trade-offs.
- Perform layer- and token-level error analysis to inform numerical format choices.
- Collaborate with the compiler team to translate findings into SDK flows and reference configurations.
- Publish internal whitepapers and external benchmarks, and present results at industry events and to customers.
- Stay current with academic literature on compression and efficient inference, and translate promising ideas into reproducible prototypes.
Qualifications :
- M.S. / Ph.D. in CS, EE, Applied Math, or related fields, with 5+ years in ML model optimization or data-driven research.
- Deep understanding of fixed-point arithmetic, quantization theory, and statistical calibration.
- Proficiency in Python, PyTorch or TensorFlow, NumPy / Pandas / SciPy, and data visualization tools like Matplotlib or Plotly.
- Hands-on experience with at least one quantization toolkit (e.g., PyTorch FX / PTQ / QAT, TF-Lite, ONNX-Runtime, TVM, MLIRQuant).
- Knowledge of CNNs, Transformers, and DNN architectures.
Benefits :
- Competitive salaries and meaningful equity
- Health care plans (Medical, Dental, Vision)
- Retirement plans (401k, IRA)
- Life insurance (Basic, Voluntary, AD&D)
- Paid time off (Vacation, Sick, Holidays)
- Family leave (Maternity, Paternity)
- Work from home options
- Free food and snacks
Founded in 2016 and based in downtown Burlingame, California, Quadric is building the world’s first supercomputer for real-time edge device needs. Our goal is to empower developers across industries to create innovative technologies today. The company was co-founded by technologists from MIT and Carnegie Mellon, previously involved in Bitcoin computing at 21.
Quadric is an equal opportunity and affirmative action employer. We are committed to diversity and inclusion, regardless of race, religion, sex, national origin, sexual orientation, age, citizenship, marital status, or disability.