bclarkson-code / Tricycle

Deep learning framework built completely from scratch in Python + NumPy

Fused operations

bclarkson-code opened this issue

A lot of time is spent in the MLP blocks and attention blocks. The operations in these blocks can be fused to reduce both memory usage and latency.
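
As a rough illustration of what fusing could look like, here is a minimal NumPy sketch that combines a bias add and a ReLU into one operation. The function names, the choice of ReLU, and the `(batch, features)` input layout are assumptions made for this example and are not Tricycle's actual API. The unfused version allocates a temporary per step, while the fused version writes into a single buffer and needs only its own output for the backward pass.

```python
import numpy as np


def bias_relu_unfused(x, b):
    """Bias add followed by ReLU as two separate operations (hypothetical)."""
    h = x + b                  # first temporary array
    return np.maximum(h, 0.0)  # second temporary array


def bias_relu_fused(x, b):
    """Same result as above, computed in a single output buffer (hypothetical)."""
    out = np.add(x, b)             # the only allocation
    np.maximum(out, 0.0, out=out)  # ReLU applied in place
    return out


def bias_relu_fused_backward(out, grad):
    """Backward pass for the fused op, using only the cached output.

    Assumes x has shape (batch, features) and b has shape (features,).
    """
    grad_x = grad * (out > 0)    # ReLU passes gradient only where out > 0
    grad_b = grad_x.sum(axis=0)  # the bias was broadcast over the batch axis
    return grad_x, grad_b


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))
b = rng.standard_normal(3)

out = bias_relu_fused(x, b)
grad_x, grad_b = bias_relu_fused_backward(out, np.ones_like(out))
```

In this sketch the saving comes from skipping the intermediate `x + b` array and from caching one tensor instead of two for the backward pass. The same idea would apply to longer chains of elementwise operations.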