actions-on-google / actionssdk-shiritori-ja-nodejs

Shiritori AoG sample game

Home Page: https://assistant.google.com/services/a/uid/00000064f48a4a82

Provide a Jupyter notebook for corpus generation

proppy opened this issue

We should provide a notebook that documents the corpus generation and lets developers easily create alternative corpora.

This should cover:

  • word extraction and filtering
  • indexing
  • text embedding model generation
  • upload to database.
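
As a starting point for such a notebook, the first two steps might look roughly like the sketch below. It is not taken from the repository: the function names `filter_words` and `build_index`, the hiragana-only rule, and the sample word list are all assumptions made for illustration. It keeps hiragana readings that are playable in shiritori (dropping words that end in ん) and indexes them by their first kana. The embedding and upload steps are left out because they depend on the model and database chosen.

```python
import re
from collections import defaultdict

# Hypothetical rule: accept readings written only in hiragana
# (U+3041..U+3096) plus the long vowel mark.
HIRAGANA_ONLY = re.compile(r"[\u3041-\u3096ー]+")


def filter_words(readings, min_length=2):
    """Return unique hiragana readings usable in shiritori, in input order."""
    seen = set()
    kept = []
    for reading in readings:
        reading = reading.strip()
        if len(reading) < min_length:
            continue  # too short to be an interesting move
        if not HIRAGANA_ONLY.fullmatch(reading):
            continue  # not a pure hiragana reading
        if reading.endswith("ん"):
            continue  # a word ending in ん loses the game
        if reading in seen:
            continue  # skip duplicates
        seen.add(reading)
        kept.append(reading)
    return kept


def build_index(words):
    """Group words by their first kana so a reply can be looked up quickly."""
    index = defaultdict(list)
    for word in words:
        index[word[0]].append(word)
    return dict(index)


# Made-up sample input; a real notebook would read readings from a corpus file.
sample = ["りんご", "ごりら", "らっぱ", "みかん", "apple", "ぱんだ", "りんご"]
words = filter_words(sample)
index = build_index(words)
print(words)
print(index)
```

Indexing by first kana means that finding a reply to a word is a single dictionary lookup on that word's last kana. A real corpus would also need rules for small kana and long vowel marks at the end of a word, which this sketch ignores.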

Interesting alternative corpora are available at https://www.ninjal.ac.jp/english/database/.

In particular, http://chaki-data.ninjal.ac.jp/momotaro/momotaro-2015-11-10/ looks very interesting :)