Zae Myung Kim's repositories

sentsplit

A flexible sentence segmentation library using a CRF model and regex rules

Language: Python | License: MIT | Stargazers: 22 | Issues: 4 | Issues: 12

streamlit-tutorial

A simple tutorial script on Streamlit using the Iris Dataset

Language: Python | License: MIT | Stargazers: 12 | Issues: 2 | Issues: 0

Visualizing-Cross-Lingual-Discourse-Relations

Code for the paper "Visualizing Cross-Lingual Discourse Relations in Multilingual TED Corpora", presented at CODI 2021 @ EMNLP 2021

crawl-naver-news-and-comments

Crawling the most read news articles per day over the years (with comments)

Language: Python | Stargazers: 1 | Issues: 4 | Issues: 0
Language: Shell | Stargazers: 1 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 2

bertviz

Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

ContraPro

Contrastive evaluation of pronoun translation in neural machine translation

Language: Perl | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

Cornell-Conversational-Analysis-Toolkit

ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use of the toolkit on these datasets.

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

Creative-Commons-Markdown

Markdown-formatted Creative Commons licenses

Stargazers: 0 | Issues: 1 | Issues: 0
Stargazers: 0 | Issues: 1 | Issues: 0

Discourse-Phenomena-in-Document-level-Neural-Machine-Translation

Datasets for "A Test Suite for Evaluating Discourse Phenomena in Document-level Neural Machine Translation" accepted by Proceedings of the Second International Workshop of Discourse Processing

Stargazers: 0 | Issues: 1 | Issues: 0

DMRST_Parser

An implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Dockerfile | Stargazers: 0 | Issues: 2 | Issues: 1

good-translation-wrong-in-context

This is a repository with the data and code for the ACL 2019 paper "When a Good Translation is Wrong in Context: ..." and the EMNLP 2019 paper "Context-Aware Monolingual Repair for Neural Machine Translation"

Language: Ruby | Stargazers: 0 | Issues: 1 | Issues: 0

google-research

Google Research

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

kmeans_pytorch

K-means clustering using PyTorch

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

korean_wordlist

Korean wordlist

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0
Language: PLSQL | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

mtdlc

Library for parsing document-level corpora for machine translation

License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

Pytorch-Sequence-Bucket-Iterator

A minimal sampler example for bucketing sequences of similar lengths in PyTorch, based on @TrentBrick's script: https://gist.github.com/TrentBrick/bac21af244e7c772dc8651ab9c58328c

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

Shallow-Discourse-Annotation-for-Chinese-TED-Talks

Datasets for "Shallow Discourse Annotation for Chinese TED Talks", accepted at LREC 2020

Stargazers: 0 | Issues: 0 | Issues: 0

st-annotated-text

A simple component to display annotated text in Streamlit apps.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 1 | Issues: 0

transformer-lm

Transformer language model (GPT-2) with sentencepiece tokenizer

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

utils

simple scripts that make life easier...

Language: Shell | Stargazers: 0 | Issues: 2 | Issues: 0

weightedWWL

Learning subtree pattern importance for WL-based graph kernels

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

zaemyung.github.io

A beautiful, simple, clean, and responsive Jekyll theme for academics

Language: HTML | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0