[Abstract] The Annotated BERT

codertimo opened this issue 6 years ago

Abstract

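I would like to put together a textbook-style guide that walks step by step through BERT and the Transformer. BERT is the much-discussed paper that set new state-of-the-art results on 11 NLP tasks and surpassed human performance on the SQuAD dataset. The material will build intuition through visual diagrams of the model architecture and GIF-based explanations of the attention mechanism, then review a cleanly written BERT implementation to show how the model is put together in practice.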

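Briefly covers the basics and history of the Transformer, starting from Attention Is All You Need, along with the papers that have built on it.
Provides a tutorial that goes through the BERT-pytorch implementation line by line, in the style of The Annotated Transformer.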

References

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding : https://arxiv.org/abs/1810.04805
BERT-pytorch : https://github.com/codertimo/BERT-pytorch
Attention Is All You Need : https://arxiv.org/abs/1706.03762
The Annotated Transformer : http://nlp.seas.harvard.edu/2018/04/03/attention.html
Universal Transformer : https://ai.googleblog.com/2018/08/moving-beyond-translation-with.html
The Illustrated Transformer : http://jalammar.github.io/illustrated-transformer/
How to code Transformer : https://towardsdatascience.com/how-to-code-the-transformer-in-pytorch-24db27c8f9ec