ganbon's starred repositories
project-dialogism-novel-corpus
The official repository for the Project Dialogism Novel Corpus, a dataset of annotated quotations in full-length English novels.
real-persona-chat
RealPersonaChat: A Realistic Persona Chat Corpus with Interlocutors' Own Personalities
awesome-embedding-models
A curated list of awesome embedding models tutorials, projects and communities.
wikiextractor
A tool for extracting plain text from Wikipedia dumps
awesome-japanese-llm
Overview of Japanese LLMs
mattermost-docker
Deprecated
whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
Jperchat_to_perchat
A program that converts NTT's Japanese Persona-Chat data into the format of the original Persona-Chat
bert-classification-tutorial
[2023 edition] Text classification with BERT
openai-cookbook
Examples and guides for using the OpenAI API
PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
japanese-dialog-transformers
Code for evaluating Japanese pretrained models provided by NTT Ltd.
mecab-ipadic-neologd
Neologism dictionary based on the language resources on the Web for mecab-ipadic
ner-wikipedia-dataset
A Japanese named entity recognition dataset built from Wikipedia