So Miyagawa's repositories
whisper
Robust Speech Recognition via Large-Scale Weak Supervision
tagger-part-of-speech
Part of speech tagger for Sahidic Coptic
coptic-translator-frontend
A frontend for a Coptic machine translation utility.
coptic-translator-backend
Utilities and notebook code used to parse data and create Coptic machine translation models.
annotorious-v2-selector-pack
Additional selection tools for Annotorious and the Annotorious OpenSeadragon plugin
KirishitanLigaturesFont
Font for displaying abbreviated ligatures found in Kirishitan Ban prints.
doccano
Open source annotation tool for machine learning practitioners.
lingrex
Linguistic Reconstruction with LingPy
camel
🐫 CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society
Prompt-Engineering-Guide-Japanese
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
LAREX
A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.
ndlkotenocr_cli
Application for NDL Kotenseki OCR (NDL古典籍OCR, OCR for classical Japanese books)
ndlocr_cli
Application for NDLOCR
ndlngramdata
Dataset of n-gram frequency statistics for OCR text data created from digitized materials
layout-dataset
NDL-DocL dataset (document image layout dataset)
PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM