gwern / gwern.net

Site infrastructure for gwern.net. Custom Hakyll website with unique link archiving, popup UX, transclusions/collapses, dark+reader mode, bidirectional backlinks, and typography (sidenotes, dropcaps, link icons, inflation-adjustment, subscripted-citations).

Home Page: https://gwern.net/design

Add arXiv PDF - Vaswani et al 2017 - Transformers

koenrane opened this issue

Proposal to add a PDF/link to the following line in the article Why Tool AIs Want to Be Agent AIs:

"While LSTM RNNs are the default for sequence tasks, they have occasionally been beaten by feedforward neural networks—using internal attention or “self-attention”, like the Transformer architecture (eg Vaswani et al 2017 or Al-Rfou et al 2018)"

Add this arXiv paper:
Attention Is All You Need (https://arxiv.org/abs/1706.03762)

Vaswani is already linked in the previous paragraph, so that's unnecessary.