A Functional Approach to Synthesizing Routable Programmable Accelerators for Neural Networks


Producing optimized accelerators is tedious, as even modern HDLs (Hardware Description Languages), such as Chisel, require reasoning about low-level concepts. Recent functional approaches, such as Aetherling and SHIR, treat hardware as a composition of pure operators. This raises the abstraction level, allowing systematic optimizations through rewrite rules for FPGAs (Field Programmable Gate Arrays).
These approaches have so far been limited to small, fixed-function accelerators. Recent work maps neural networks to FPGAs by sharing coarse-grained functions via the Let construct. However, as the number of call sites or parallelism increases, synthesis fails due to increased routing congestion.
These limitations are addressed with a new way to express sharing in a functional IR (Intermediate Representation). By combining the Reduce and SwitchApply primitives over an instruction stream, functions become programmable, with shared control logic and a datapath, reducing routing pressure. Upper-bounded streams further enable sharing across varying input sizes. Across networks from LeNet-5 to ResNet, the resulting FPGA designs remain routable, delivering high performance with speedups of 1.1× to 3.4× compared to prior work.
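As a rough software analogy for combining Reduce and SwitchApply over an instruction stream, the sketch below folds a list of instructions over a data value, with each instruction selecting one function from a shared table. It is only an illustration under stated assumptions: the names `switch_apply`, `run_program`, `SHARED`, and the three example operators are hypothetical, and the real primitives describe hardware, whose semantics may differ from this Python model.

```python
from functools import reduce

# Hypothetical shared operators; in hardware each would be instantiated once.
def relu(xs):
    return [max(0, x) for x in xs]

def scale2(xs):
    return [2 * x for x in xs]

def negate(xs):
    return [-x for x in xs]

def switch_apply(selector, functions, data):
    """Apply the function chosen by `selector` to `data` (assumed semantics)."""
    return functions[selector](data)

def run_program(instructions, functions, initial):
    """Fold the instruction stream over the data, one shared function per step."""
    return reduce(
        lambda state, instr: switch_apply(instr, functions, state),
        instructions,
        initial,
    )

SHARED = [relu, scale2, negate]  # selector 0, 1, 2

# Program: scale2, negate, relu, scale2
result = run_program([1, 2, 0, 1], SHARED, [1, -2, 3])
print(result)  # [0, 8, 0]
```

In this model, changing the behaviour means changing the instruction list rather than adding call sites, which loosely mirrors how a programmable design can avoid the per-call-site wiring the abstract associates with routing congestion.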
Tue 16 Jun, Mountain Time (US & Canada)
13:40 - 15:20 | Session 4: Specialized Hardware and Accelerator Design | LCTES at Flatirons 3 | Chair(s): Jongouk Choi, University of Central Florida
13:40 | 22m | Talk | Can Fine-Grain Multi-threading Subsume VLIW? | Scott Pomerville (Northern Michigan University), Soner Onder (Michigan Technological University), Gang-Ryung Uh (Florida State University), David Whalley (Florida State University)
14:02 | 22m | Talk | Sirop: A Small IR for HLS with Parallel Patterns
14:24 | 22m | Talk | A Functional Approach to Synthesizing Routable Programmable Accelerators for Neural Networks | Tzung-Han Juang (McGill University), Paul Teng (McGill University, Canada), Christophe Dubach (McGill University)
14:46 | 22m | Talk (Remote) | LoopHint: A Compiler-Assisted Loop Branch Predictor for Embedded DSPs | Yuanyang Xiang (Institute of Automation, Chinese Academy of Sciences), Chen Xu, xiaoruozhou (Institute of Automation, Chinese Academy of Sciences), Zhiwei Zhang (Institute of Automation, Chinese Academy of Sciences)