Papers Voting

hadyelsahar opened this issue 4 years ago

Hady Elsahar commented 4 years ago

In this issue you can either:

Add papers that you think are interesting to read and discuss (please stick to the format).
Vote for a paper by adding a 👍 to its comment.

Hady Elsahar · Answer 1 · Fri May 01 2020 23:11:12 GMT+0800 (China Standard Time)

Reformer: The Efficient Transformer
https://arxiv.org/abs/2001.04451

Summary:

Large Transformer models routinely achieve state-of-the-art results on a number of tasks but training these models can be prohibitively costly, especially on long sequences. We introduce two techniques to improve the efficiency of Transformers. For one, we replace dot-product attention by one that uses locality-sensitive hashing, changing its complexity from O(L^2) to O(L log L), where L is the length of the sequence. Furthermore, we use reversible residual layers instead of the standard residuals, which allows storing activations only once in the training process instead of N times, where N is the number of layers. The resulting model, the Reformer, performs on par with Transformer models while being much more memory-efficient and much faster on long sequences.
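
For anyone skimming before the discussion, here is a rough sketch of the two ideas in the abstract. It is a toy illustration in plain Python, not the paper's implementation. The function names (`make_projections`, `lsh_bucket`, `rev_forward`, `rev_inverse`) are made up for this example, and `f` and `g` stand in for the attention and feed-forward sublayers. The hashing part assumes the random-projection (angular) scheme, where a vector's bucket is the argmax over its projections and their negations. The reversible part assumes the usual two-stream coupling, where the inputs can be recomputed from the outputs instead of being stored.

```python
import random


def make_projections(dim, n_buckets, seed=0):
    """Draw n_buckets // 2 random Gaussian directions (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(n_buckets // 2)]


def lsh_bucket(vec, projections):
    """Angular LSH: bucket = argmax over [xR, -xR].

    Vectors pointing in similar directions tend to share a bucket, so
    attention can be restricted to items within the same bucket.
    """
    rotated = [sum(v * r for v, r in zip(vec, row)) for row in projections]
    scores = rotated + [-s for s in rotated]
    return max(range(len(scores)), key=scores.__getitem__)


def rev_forward(x1, x2, f, g):
    """Reversible residual block: y1 = x1 + f(x2), y2 = x2 + g(y1)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def rev_inverse(y1, y2, f, g):
    """Recover the block inputs from its outputs, so they need not be stored."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2
```

Calling `lsh_bucket(q, make_projections(len(q), 8))` gives a bucket id in `range(8)`. Calling `rev_inverse(*rev_forward(x1, x2, f, g), f, g)` should return the original `x1` and `x2` up to floating-point rounding, which is what makes it possible to skip storing per-layer activations.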

Hady Elsahar · Answer 2 · Fri May 01 2020 23:11:31 GMT+0800 (China Standard Time)

Unsupervised Question Decomposition for Question Answering
https://arxiv.org/abs/2002.09758
Twitter thread: https://twitter.com/EthanJPerez/status/1232127027961942018

"We aim to improve question answering (QA) by decomposing hard questions into easier sub-questions that existing QA systems can answer. Since collecting labeled decompositions is cumbersome, we propose an unsupervised approach to produce sub-questions."