May 31, 2022|Press Releases

Neuchips' Purpose-Built Accelerator Designed to Be Efficient Recommendation Inference Engine

LOS ALTOS, Calif., May 31, 2022 -- NEUCHIPS is excited to announce its first ASIC, the RecAccel N3000, built on TSMC’s 7nm process and designed specifically to accelerate deep learning recommendation models (DLRM). NEUCHIPS has partnered with industry leaders in Taiwan’s semiconductor and cloud server ecosystem and plans to deliver its RecAccel N3000 AI inference platform in the second half of 2022, on dual M.2 modules for Open Compute Project-compliant servers and on PCIe Gen 5 cards for standard data center servers.

“In 2019, when Facebook open-sourced their Deep Learning Recommendation Model and challenged the industry to deliver a balanced AI inference chip platform, we decided to pursue the challenge,” said Dr. Lin, CEO of NEUCHIPS, co-founder of Global Unichip Corp. (a subsidiary of TSMC), and professor at National Tsing Hua University, Taiwan. “Our continued improvements in MLPerf DLRM benchmarking and whole-chip emulation give us confidence that our RecAccel AI hardware architecture co-designed with our software will scale to deliver industry leadership and exceed our target of 20M inferences per second at 20 Watts.”

The NEUCHIPS RecAccel N3000 inference platform includes sophisticated hardwired accelerators, patented query scheduling, and a comprehensive software stack optimized to provide high accuracy and hardware utilization while maintaining the energy efficiency required in data centers. Other key features include the following:

  •    Proprietary 8-bit coefficient quantization, calibration and hardware support that deliver 99.95% of FP32 accuracy.
  •    Patented embedding engine with novel cache design and DRAM traffic optimization that reduces LPDDR5 access by 50% and increases bandwidth utilization by 30%.
  •    Dedicated MLP compute engines that deliver state-of-the-art energy efficiency at the engine level, and 1 microjoule per inference at the SoC level.
  •    Proven software stack that delivers very high scalability across multiple cards.
  •    Support for leading recommender AI models including DLRM, WND, DCN, and NCF.
  •    Robust security based on hardware root of trust.
