Repositories of the Language Media Processing Lab, Kyoto University
KyotoCorpus
Kyoto University Text Corpus
AnnotatedFKCCorpus
Annotated Fuman Kaitori Center Corpus
text-cleaning
A powerful text cleaner for Japanese web texts
kyoto-reader
A processor for KyotoCorpus, KWDLC, and AnnotatedFKCCorpus
KyotoCorpusAnnotationTool
An annotation tool for the Kyoto University Corpus
latent_language_of_multilingual_model
Partial code for the arXiv paper "Beyond English-Centric LLMs: What Language Do Multilingual Language Models Think in?"
dockerfile-jumanpp-knp
Dockerfiles for Juman++, KNP, and KWJA
Abstractive-Multi-Video-Captioning
Implementation of the paper "Abstractive Multi-Video Captioning: Benchmark Dataset Construction and Extensive Evaluation"
ARKitSceneRefer
ARKitSceneRefer: Text-based Localization of Small Objects in Diverse Real-World 3D Indoor Scenes (EMNLP 2023 Findings)
Evaluate-Alignment-HVSB
Source code for the paper "Do LLMs Align Human Values Regarding Social Biases? Judging and Explaining Social Biases with LLMs"