Q3.5-Format LUT-Based GELU Accelerator

Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Abstract

In this paper, we present two variants of a Q3.5-format lookup-table (LUT) based hardware accelerator for the Gaussian Error Linear Unit (GELU): the Full-LUT variant, which stores all 193 entries explicitly, and the Range-Merged LUT variant, which compacts runs of entries into range tests, leaving 85 explicit entries plus two bypass branches. The Full-LUT variant occupies 112.495 μm², operates at 3.33 GHz, and consumes 0.19921 mW, while the Range-Merged LUT variant occupies 146.232 μm², operates at 5.00 GHz, and consumes 0.464255 mW. When integrated into a Vision Transformer (ViT-B/16) model for ImageNet classification, both variants show negligible accuracy loss compared to full-precision GELU, demonstrating the efficacy of low-bit LUT approximations for high-efficiency ViT inference.
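The two table organizations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' hardware design, and it rests on several assumptions that the abstract does not state: Q3.5 is taken to be a signed 8-bit code with 5 fractional bits; the table domain is taken to be [-3.0, 3.0], which happens to give 6 × 32 + 1 = 193 codes; the two bypass branches are taken to return 0 below the domain and the input itself above it; and "runs" are taken to be consecutive identical outputs. The names `to_q35`, `gelu_lut`, and `merge_runs` are hypothetical.

```python
import math

FRAC_BITS = 5                      # Q3.5: 5 fractional bits, so one LSB is 1/32
SCALE = 1 << FRAC_BITS             # 32
LO, HI = -3 * SCALE, 3 * SCALE     # assumed table domain [-3.0, 3.0] -> raw codes -96..96


def gelu(x: float) -> float:
    """Reference GELU, x * Phi(x), computed with the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def to_q35(x: float) -> int:
    """Quantize a real value to a signed 8-bit Q3.5 code, saturating at the limits."""
    return max(-128, min(127, round(x * SCALE)))


# Full-LUT style: one precomputed Q3.5 output per raw input code in the assumed domain.
FULL_LUT = [to_q35(gelu(code / SCALE)) for code in range(LO, HI + 1)]


def gelu_lut(code: int) -> int:
    """Return the Q3.5 GELU output for a raw Q3.5 input code."""
    if code < LO:
        return 0                   # assumed bypass: GELU(x) is close to 0 for x < -3
    if code > HI:
        return code                # assumed bypass: GELU(x) is close to x for x > 3
    return FULL_LUT[code - LO]


def merge_runs(table):
    """Collapse consecutive equal outputs into (first_index, last_index, value) ranges."""
    runs = []
    start = 0
    for i in range(1, len(table) + 1):
        if i == len(table) or table[i] != table[start]:
            runs.append((start, i - 1, table[start]))
            start = i
    return runs


if __name__ == "__main__":
    runs = merge_runs(FULL_LUT)
    print(f"full table entries: {len(FULL_LUT)}")
    print(f"ranges after merging equal neighbours: {len(runs)}")
    print(f"GELU(1.0) via LUT: {gelu_lut(to_q35(1.0)) / SCALE}")
```

The range count printed by this sketch depends on the rounding and merging rules chosen here, so it should not be expected to match the 85 explicit entries reported in the abstract.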

Details

Original language: English
Title: International SoC Design Conference 2025, ISOCC 2025 - Proceedings of Technical Papers
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
ISBN (electronic): 979-8-3315-8642-3
ISBN (print): 979-8-3315-8643-0
Publication status: Published - Jan. 2026
Peer-review status: Yes

Publication series

Series: International SoC Design Conference, ISOCC
ISSN: 2163-9612

Conference

Title: 22nd International SoC Design Conference
Subtitle: Pioneering Heterogeneous Integration of SoCs for AI-Driven Smart Systems
Short title: ISOCC 2025
Event number: 22
Duration: 15 - 18 October 2025
Location: Paradise Hotel Busan
City: Busan
Country: South Korea

External IDs

ORCID /0009-0007-8401-7852/work/211722633