oli-clive-griffin / learnable-skip

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Learnable Skip

Just some quick experimentation. I had an idea while reading highway networks that it might be nice to allow models with skip connections the comptelety "turn off" the skip connection. With current architectures, if a layer wants to override the residual stream. It has to learn to produce both the new features, and the inverse of the existing residual stream. By adding a learned (learned during training, static at inference time) gate parameter, it enables layers to learn to completely override the residual stream.

About


Languages

Language:Python 100.0%