BearDimonR / nlp-systems-kma

Natural Language Processing Systems course at NaUKMA

General Info and Course Description

  • Overview of traditional NLP methods, recurrent neural networks, seq-to-seq models, the attention mechanism and transformers, and large language models and their practical applications, with a focus on the latter.
  • 8 weeks of work, with a lecture and a practice session each week. There may be an invited lecturer from Grammarly to go deeper into a topic of your interest.
  • Attendance is highly desirable but not obligatory. Timely submission of home assignments is encouraged but not enforced.

Topics and Timeline

As the course is brand new, the topics, assignments, structure, etc. may be subject to change (within reason).

  • Topic 1. Word Vectors, Word Embeddings, CBOW model.
  • Topic 2. Language Models and Recurrent Neural Networks.
  • Topic 3. Attention Mechanism and Transformers.
  • Topic 4. Large Language Models and their Applications.

For each topic, there is a home assignment, usually a Jupyter notebook with additional utility scripts and a detailed description of the task. Students are expected to complete the coding tasks and, where prompted, give their opinion on certain problems. The maximum number of points per assignment is 20 pts.

Evaluation

  • The 3 home assignments (H/As) account for 60 pts of the total mark.
  • A short oral presentation (up to 15 min) of a selected NLP research paper gives a maximum of 10 pts.
  • The final exam, which takes the form of a group project (similar to a Kaggle competition), yields the remaining 30 pts.

Activity                Max. Points
3 Home Assignments      60
Research Presentation   10
Final Group Project     30
Total                   100

Assignment 1 (20 pts)

In this assignment, you will practice computing word embeddings and using them for sentiment analysis. The assignment can be found here.
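
To give a feel for the kind of pipeline involved, here is a minimal sketch (not the assignment's actual code) of one common baseline: average the word vectors of a sentence and feed the result to a logistic classifier. The embedding values, the `WEIGHTS` and `BIAS` constants, and both function names are made up for illustration; in practice the embeddings and weights would be learned from data.

```python
import math

# Toy 3-dimensional embeddings; the values are invented for illustration only.
TOY_EMBEDDINGS = {
    "great": [0.9, 0.1, 0.0],
    "good": [0.8, 0.2, 0.1],
    "awful": [-0.9, 0.1, 0.0],
    "bad": [-0.8, 0.2, 0.1],
    "movie": [0.0, 0.9, 0.3],
}

# Hypothetical hand-picked classifier parameters (normally learned from labeled data).
WEIGHTS = [4.0, 0.0, 0.0]
BIAS = 0.0


def sentence_vector(tokens, embeddings):
    """Average the embeddings of known tokens; return None if no token is known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def predict_positive_probability(tokens, embeddings, weights, bias):
    """Logistic regression on top of the averaged sentence vector."""
    vec = sentence_vector(tokens, embeddings)
    if vec is None:
        return 0.5  # no known words, so no evidence either way
    score = sum(w * x for w, x in zip(weights, vec)) + bias
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    for text in ("great movie", "awful movie"):
        p = predict_positive_probability(text.split(), TOY_EMBEDDINGS, WEIGHTS, BIAS)
        print(f"{text!r}: P(positive) = {p:.2f}")
```

Averaging ignores word order, which is why it is usually treated as a baseline rather than a final model.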

Assignment 2 (20 pts)

In this assignment, you will practice data preprocessing, training an RNN from scratch, and forward and backward propagation through time, and you will also generate some slick dinosaur names. The assignment can be found here.
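
As a rough illustration of the forward pass and the name-sampling loop (not the assignment's actual code), the sketch below implements one step of a vanilla character-level RNN in plain Python and samples characters until a newline is drawn. The function names, the parameter layout, and the tiny vocabulary are assumptions made for this example; the weights are random, so the output is gibberish until the network is trained with backpropagation through time.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_step(x, h_prev, Wxh, Whh, Why, bh, by):
    """One forward step: h = tanh(Wxh x + Whh h_prev + bh), p = softmax(Why h + by)."""
    pre = [a + b + c for a, b, c in zip(matvec(Wxh, x), matvec(Whh, h_prev), bh)]
    h = [math.tanh(p) for p in pre]
    logits = [a + b for a, b in zip(matvec(Why, h), by)]
    return h, softmax(logits)


def random_params(vocab_size, hidden_size, seed=0):
    """Small random weights and zero biases (untrained, for demonstration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return (mat(hidden_size, vocab_size), mat(hidden_size, hidden_size),
            mat(vocab_size, hidden_size), [0.0] * hidden_size, [0.0] * vocab_size)


def sample_name(params, vocab, max_len=20, seed=0):
    """Sample characters one at a time, stopping at a newline or after max_len."""
    rng = random.Random(seed)
    h = [0.0] * len(params[3])
    x = [0.0] * len(vocab)  # zero vector as the initial input
    chars = []
    for _ in range(max_len):
        h, probs = rnn_step(x, h, *params)
        idx = rng.choices(range(len(vocab)), weights=probs)[0]
        if vocab[idx] == "\n":
            break
        chars.append(vocab[idx])
        x = [0.0] * len(vocab)
        x[idx] = 1.0  # one-hot encoding of the sampled character
    return "".join(chars)


if __name__ == "__main__":
    vocab = list("abcdefghijklmnopqrstuvwxyz") + ["\n"]
    params = random_params(len(vocab), hidden_size=16)
    print(sample_name(params, vocab))
```

Feeding the sampled character back in as the next input is what lets the hidden state carry context from one character to the next.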

Assignment 3 (20 pts)

TBD.

Research Presentation and Topics (10 pts)

You are to select an NLP paper of interest and present a short, 10-min breakdown of it during class. This may be either a live presentation or a recording.

Final Group Project "UNLP 2023 GEC for Ukrainian" (30 pts)

Link to the task and the instructions: https://github.com/asivokon/unlp-2023-shared-task. You are to split into teams of 3-5 people, come up with a name for the team, and select one person as a team lead (TL). Fill out the team composition here.
