wanburana / NLP

This is the repository for the Natural Language Processing course at the Asian Institute of Technology. It mostly covers theoretical aspects of NLP, with some coding assignments using PyTorch.

Home Page: http://chaklam.com

NLP - Natural Language Processing

This is the repository for the Natural Language Processing course at the Asian Institute of Technology (AIT).

Google Slides lectures can be found at: https://drive.google.com/drive/folders/14x9_-Y_aWysPIZFLaVrZE2ngy2h_beJj?usp=sharing

For solutions to the assignments, please contact my TAs (information below). I have excluded the solutions from this repository so that my AIT students can learn from the assignments.

I would also like to give huge credit to several GitHub repositories and web resources that I adapted to create this course:

I would also like to thank the students who have contributed:

If you have any questions regarding the lectures or assignments, do not hesitate to contact the TAs during their office hours.

Prerequisites

This course is best taken after the following courses offered in the August semester:

Course Outline

The course will be delivered over 13 weeks, with 2 lectures per week and each lecture lasting 1.5 hours.

Part I: Fundamentals

  1. Word Vectors - Word2vec (A1 starts)
  2. Word Vectors - GloVe
  3. Neural Networks and Backpropagation Review (A2 starts, A1 due)
  4. Dependency Parsing
  5. Constituency Parsing (A3 starts, A2 due)

Part II: Model Architectures

  1. Language Models and Recurrent Neural Networks
  2. LSTM and GRU
  3. Machine Translation, Attention (A4 starts, A3 due)
  4. Transformer
  5. Pretrained Models - BERT, GPT, T5
  6. Word Vectors - FastText, ELMo (A5 starts, A4 due)

Part III: NLP Tasks and Evaluations

  1. Natural Language Generation
  2. Question-Answering (A6 starts, A5 due)
  3. Analysis of Model's Inner Workings
  4. Project Tips and Ideas (by TAs) (A6 due; project starts)
  5. Design Workshops Part 1 (by TAs)
  6. Design Workshops Part 2 (by TAs)
  7. Knowledge Integration
  8. Coreference Resolution

Part IV: Future of NLP

  1. Recent NLP Trend I (by TAs)
  2. Recent NLP Trend II (by TAs)
  3. Recent NLP Trend III (by TAs)

Part V: Project

  1. Project Progress Presentation
  2. Project Day
  3. Project Day
  4. Final Project Presentation

Grade Criteria

The course has the following grade criteria:

  1. Assignments (40%)
    • There will be a total of 6 coding assignments.
    • Any late work (as indicated by Google Classroom) will receive a 50% deduction. NO excuses will be accepted.
    • We are extremely serious about copying and plagiarism. These assignments are intended for you to learn. The TAs have the authority to give a zero or a partial score for any form of plagiarism or similar misconduct. Their call IS FINAL.
      • A1: Getting Started (5%)
      • A2: Word2Vec (7%)
      • A3: Dependency Parsing (7%)
      • A4: Bidirectional LSTM with Attention for Classification from Scratch (7%)
      • A5: Transformers for Seq2Seq from Scratch (7%)
      • A6: Pretraining BERT + finetuning (7%)
  2. Final Project (45%)
    • Three default project topics will be given as follows:
      • (1) Text Summarization (Pranissa)
      • (2) Pretraining + Fine-tuning on Intent Classification (Sitiporn)
      • (3) Social Media Depression (Chanapa)
    • The main criteria focus on learning, in particular:
      • (1) Novelty (related work) (20%)
      • (2) Experiment rigour (comparisons) (20%)
      • (3) Model complexity (competency) (20%)
      • (4) Evaluation methods (appropriate) (20%)
      • (5) Effort (not last day!) (20%)
    • Submission deliverables:
      • (1) Python file (e.g., notebook, .py)
      • (2) Presentation file (e.g., .pdf, .ppt)
      • (3) Dataset
  3. Flipped Classroom Quiz (15%)
    • Each quiz contains a few multiple-choice questions on the coming week's lectures (usually two sets of lecture slides), starting from the second lecture.
    • Any late work (as indicated by Google Classroom) will receive a 30% deduction. NO excuses will be accepted.
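
The weights above add up to 100% (40% assignments, 45% final project, 15% quizzes). As a rough illustration of how they combine, here is a minimal sketch. The function name `course_grade` and its parameters are hypothetical, scores are assumed to be fractions between 0 and 1, and "deducted 50%" is assumed to mean that the earned score of a late assignment is halved. The course text does not spell out the exact rule, so treat this as one possible reading rather than the official calculation. The 30% late deduction for quizzes is not modelled.

```python
# Weights taken from the grade criteria above, in percent of the course grade.
ASSIGNMENT_WEIGHTS = {"A1": 5, "A2": 7, "A3": 7, "A4": 7, "A5": 7, "A6": 7}
PROJECT_WEIGHT = 45
QUIZ_WEIGHT = 15


def course_grade(assignment_scores, project_score, quiz_score, late=()):
    """Return the course grade in percent.

    Every input score is a fraction in [0, 1]. `late` lists the names of
    assignments that were submitted late.
    """
    total = 0.0
    for name, weight in ASSIGNMENT_WEIGHTS.items():
        score = assignment_scores.get(name, 0.0)  # missing work counts as zero
        if name in late:
            score *= 0.5  # assumption: a 50% deduction halves the earned score
        total += weight * score
    total += PROJECT_WEIGHT * project_score
    total += QUIZ_WEIGHT * quiz_score
    return total


if __name__ == "__main__":
    perfect = {name: 1.0 for name in ASSIGNMENT_WEIGHTS}
    print(course_grade(perfect, project_score=1.0, quiz_score=1.0))
    print(course_grade(perfect, project_score=1.0, quiz_score=1.0, late=("A2",)))
```

Under these assumptions, a single late 7% assignment with an otherwise perfect record costs 3.5 percentage points of the course grade.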

Languages

Jupyter Notebook 99.8%, Python 0.2%