byeong il, ko's repositories
ake-datasets
Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.
CTranslate2
Fast inference engine for OpenNMT models
extractor-wiki-data
Extracts multilingual data from Wikidata
gym
gym for exercising
Ivory
A Hadoop toolkit for web-scale information retrieval research
kenlm
KenLM: Faster and Smaller Language Model Queries
kobikun
Config files for my GitHub profile.
korean-sentence-splitter
Split Korean text into sentences using a heuristic algorithm.
MPNet
MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf
NER
Technical report on standardizing Korean named-entity definitions and labels, and a named-entity morpheme corpus built on it
niben
Japanese study (nihongo benkyo) with a chatbot
nltk
NLTK Source
numpy-study
numpy-study
OpenNMT-py
Open-Source Neural Machine Translation in PyTorch http://opennmt.net/
opensubtitles-parser
Download, extract, parse, and tokenize the OpenSubtitles dataset with this script
sentence-transformers
Sentence Embeddings with BERT & XLNet
simstring
SimString
study
study ipython
subtitle_chatpair
Chat pairs extracted from subtitles
subword-nmt
Subword Neural Machine Translation
TIL
TIL