kobikun

byeong il, ko's repositories

wiki

wiki documents

dastrie

Static Double Array Trie (DASTrie)

Language: C++ | License: NOASSERTION

ake-datasets

Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.

License: Apache-2.0

ConvLab

Language: Python | License: MIT

CTranslate2

Fast inference engine for OpenNMT models

License: MIT

dotfiles

extractor-wiki-data

extracts multi-language data from wiki-data

Language: Python

gym

gym for exercising

Language: Groff

Ivory

A Hadoop toolkit for web-scale information retrieval research

Language: Java

kenlm

KenLM: Faster and Smaller Language Model Queries

Language: C++ | License: NOASSERTION

kobikun

Config files for my GitHub profile.

korean-sentence-splitter

Split Korean text into sentences using a heuristic algorithm.

Language: C++ | License: BSD-3-Clause

MPNet

MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf

License: MIT

NER

Technical report on standardizing Korean named-entity definitions and tags, and a named-entity morpheme corpus built on it

niben

nihongo benkyo (Japanese study) by chatbot

nltk

NLTK Source

Language: Python | License: NOASSERTION

numpy-study

Language: Jupyter Notebook

OpenNMT-py

Open-Source Neural Machine Translation in PyTorch http://opennmt.net/

Language: Python | License: NOASSERTION

opensubtitles-parser

download, extract, parse and tokenize the opensubtitles dataset with this script

Language: Python | License: MIT

sentence-transformers

Sentence Embeddings with BERT & XLNet

License: Apache-2.0

simstring

SimString

Language: C++ | License: NOASSERTION

study

study ipython

Language: Jupyter Notebook

subtitle_chatpair

chat pairs from subtitles

Language: Python

subword-nmt

Subword Neural Machine Translation

Language: Python | License: MIT

text

Language: Python | License: BSD-3-Clause

TIL
