71 / study-korean

Materials for me to study Korean.

Background

This repository is the basis for my study of Korean, at least when it comes to using vocabulary.

After going through the entire free TTMIK Essential Korean course and finishing the (old) Duolingo course, I stopped having fun studying Korean and stopped making progress.

Other apps are either too expensive, only teach the basics, or just don't keep me entertained. Anki is good, but it feels more like a chore than anything else, and when studying with sentences I end up memorizing the sentences rather than the individual words that make them up.

Ideally, Lingvist would work, but:

  1. It's not available for Korean,
  2. Even if it were, it would be too expensive for me, especially since Lingvist doesn't work offline.

There's Clozemaster, but I can't see any logic in how it teaches me new words. Words in a single study session aren't related and use wildly different grammar points. Unlike Lingvist, Clozemaster doesn't recognize that two synonyms can fill the same blank.

This repository uses the idea behind Lingvist and adapts it to Korean:

  • Words are sorted by usage, and taught from the most used Korean words to the least used ones.
  • Words aren't taught by themselves, but rather as parts of larger sentences. Since the corpus contains many sentences, the same word is taught through different sentences, which avoids memorizing sentences instead of words.
    • Note that this is why this cannot be an Anki deck. Here, different sentences teach the same word, and the word's progress is tracked independently of the sentences, which cannot be done in Anki.
  • Spaced repetition is used.
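
The three ideas above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used by the app: every name here (`Sentence`, `WordProgress`, `newProgress`, `nextWord`, `pickSentence`, `review`) is hypothetical, and the interval-doubling schedule is a deliberately simple stand-in for whatever spaced repetition algorithm the app actually uses.

```typescript
// Hypothetical types: none of these names come from the repository.
interface Sentence {
  text: string;     // full Korean sentence
  words: string[];  // dictionary forms of the words it contains
}

interface WordProgress {
  word: string;
  rank: number;          // usage rank, 1 = most used
  intervalDays: number;  // current spacing between reviews
  dueDay: number;        // day index on which the word is next due
  lastSentence: number;  // index of the sentence shown last time, -1 if none
}

// Words are introduced from the most used to the least used.
function newProgress(wordsByUsage: string[]): WordProgress[] {
  return wordsByUsage.map((word, i) => ({
    word,
    rank: i + 1,
    intervalDays: 0,
    dueDay: 0,
    lastSentence: -1,
  }));
}

// Among the words that are due, pick the most used one.
function nextWord(progress: WordProgress[], today: number): WordProgress | undefined {
  return progress
    .filter((p) => p.dueDay <= today)
    .sort((a, b) => a.rank - b.rank)[0];
}

// Pick a sentence containing the word, rotating through the candidates
// so that the same sentence is not shown twice in a row.
function pickSentence(p: WordProgress, corpus: Sentence[]): number {
  const candidates = corpus
    .map((s, i) => (s.words.includes(p.word) ? i : -1))
    .filter((i) => i >= 0);
  if (candidates.length === 0) return -1;
  const pos = candidates.indexOf(p.lastSentence);
  return candidates[(pos + 1) % candidates.length];
}

// Assumed schedule: double the interval on success, reset it on failure.
// Progress is stored on the word, never on the sentence.
function review(p: WordProgress, sentenceIndex: number, correct: boolean, today: number): void {
  p.intervalDays = correct ? Math.max(1, p.intervalDays * 2) : 0;
  p.dueDay = today + p.intervalDays;
  p.lastSentence = sentenceIndex;
}
```

Because the schedule lives on `WordProgress` rather than on a card, any sentence containing the word can be used for a review, which is the part that does not map onto Anki's one-card-one-schedule model.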

Additionally:

  • It's 100% free, and works exclusively offline.
    • Sorry, no syncing for now.
  • Similarly, both the data used to make the app and the notebook used to generate that data are available.
  • It's generated entirely automatically. The only "curation" is the selection of a limited number of examples extracted from this excellent Anki deck.

Data

All data is produced by the notebook from the following files:

  • data/한국어 학습용 어휘 목록.xls: most common Korean words (source)
  • data/kodict.zip: Korean words with definitions, translations, examples and more (source), licensed under CC BY-SA 2.0 KR
  • data/Korean_Grammar_Sentences_by_Evita.apkg: self-explanatory (source)

data/kodata.rawproto is produced by the notebook and its schema is available in data/data.proto. Tokenization is performed using the Korean morphological analyzer.

The notebook was also saved to data/notebook.

Web dictionary

Visit https://korean.gregoirege.is to view a dictionary built with this data with offline support (but be aware that the first load will be slow!).

Disclaimer

This data was produced by me, for free, using other free resources. I make no guarantees regarding its quality and, more importantly, the authors of the works cited above are in no way involved in this project. Please don't use the data in this repository for commercial purposes without first asking them for permission.

Languages

  • JavaScript: 61.2%
  • Svelte: 23.2%
  • TypeScript: 15.1%
  • Nix: 0.2%
  • HTML: 0.2%