cplim's repositories
ACL2019-ODEE
This is the code for our ACL 2019 paper "Open Domain Event Extraction Using Neural Latent Variable Models"
burst_detection
Detect bursts in batched data using Kleinberg's (2002) algorithm.
docker-spark-cluster
A simple Spark standalone cluster for your testing environment purposes
embeddings
Fast, DB-backed pretrained word embeddings for natural language processing.
es-dedupe
Tool for removing duplicate documents from Elasticsearch
gdeltPyR
Python-based framework to retrieve Global Database of Events, Language, and Tone (GDELT) version 1.0 and version 2.0 data.
Information-Extraction-Chinese
Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT (Chinese entity recognition and relation extraction)
janome
Japanese morphological analysis engine written in pure Python
LASER
Language-Agnostic SEntence Representations
lime
Lime: Explaining the predictions of any machine learning classifier
mecab-python3
🐍 mecab-python. You can find the original version here: http://taku910.github.io/mecab/
news-graph
Key information extraction from text and graph visualization
oseti
Dictionary-based Sentiment Analysis for Japanese
pytorch_geometric
Geometric Deep Learning Extension Library for PyTorch
pytorchic-bert
PyTorch Implementation of Google BERT
RRWEL
Neural Collective Entity Linking Based on Recurrent Random Walk Network Learning. Code from IJCAI 2019 paper.
sengiri
Yet another sentence-level tokenizer for Japanese text
snorkel
A system for quickly generating training data with weak supervision
The-Elements-of-Statistical-Learning-Python-Notebooks
A series of Python Jupyter notebooks that help you better understand "The Elements of Statistical Learning" book
Thespian
Python Actor concurrency library
transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
VnCoreNLP
A Vietnamese natural language processing toolkit (NAACL 2018)
wiki-dump-reader
Extract corpora from Wikipedia dumps