Transformer Approximations from ReLUs

cs.LG updates on arXiv.org
Jerry Yao-Chieh Hu, Mingcheng Lu, Yi-Chen Lee, Han Liu

arXiv:2604.24878v1 Announce Type: new Abstract: We provide a systematic recipe for translating ReLU approximation results to the softmax attention mechanism. This recipe covers many common approximation targets. Importantly, it yields target-specific, economical resource bounds that go beyond universal approximation statements. We showcase the recipe on multiplication, reciprocal computation, and min/max primitives. These results provide new analytical tools for softmax transformer models.
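For context on the kind of ReLU approximation results the abstract refers to, the sketch below is a standard construction (not the paper's own recipe): the classic Yarotsky-style ReLU approximation of squaring on [0, 1], combined with the polarization identity to approximate multiplication. The function names and the depth parameter `m` are illustrative choices.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    # triangular "hat" function: 2x on [0, 0.5], 2(1 - x) on [0.5, 1],
    # built from two ReLUs
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def relu_square(x, m=10):
    # ReLU approximation of x**2 on [0, 1]:
    #   x**2 ≈ x - sum_{s=1}^{m} g_s(x) / 4**s,
    # where g_s is the s-fold composition of the hat function;
    # the error is at most 4**-(m + 1)
    g, approx = x, x
    for s in range(1, m + 1):
        g = hat(g)
        approx = approx - g / 4.0 ** s
    return approx

def relu_mul(x, y, m=10):
    # polarization identity: xy = ((x+y)/2)**2 - ((x-y)/2)**2, for x, y in [0, 1];
    # |x - y| is itself assembled from ReLUs
    d = relu(x - y) + relu(y - x)
    return relu_square((x + y) / 2.0, m) - relu_square(d / 2.0, m)
```

The paper's contribution, per the abstract, is to carry results of this flavor over to softmax attention with target-specific resource bounds; the code above only illustrates the ReLU side of that translation.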