xiao's repositories
AutoPhrase
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
iclr2016
Python code for training all models in the ICLR paper "Towards Universal Paraphrastic Sentence Embeddings". These models achieve strong performance on semantic similarity tasks without any training or tuning on the training data for those tasks. They can also produce features that are at least as discriminative as skip-thought vectors for semantic similarity tasks. Moreover, this code can achieve state-of-the-art results on entailment and sentiment tasks.
is-xhuang1994
Distinguish Bots from Humans on Twitter
LM-LSTM-CRF
Empower Sequence Labeling with Task-Aware Language Model
MACROSCORE
MACROSCORE project at ISI - Micro Feature Extraction direction
mrc-for-flat-nested-ner
The code for "A Unified MRC Framework for Named Entity Recognition"
nltk_contrib
NLTK Contrib
OntoNotes-5.0-NER-BIO
A BIO-formatted Named Entity Recognition dataset extracted from the OntoNotes 5.0 release.
para-nmt-50m
Pre-trained models, code, and data for training and using the models from "Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations"
python-wordsegment
English word segmentation, written in pure Python and based on a trillion-word corpus.
semi-supervised-baselines
Code for "Strong Baselines for Neural Semi-supervised Learning under Domain Shift" (Ruder & Plank, ACL 2018)
Vanilla_NER
Vanilla Sequence Labeling with Char-LSTM-CRF