Hongfei Xu's repositories
955.WLB
A list of 955 companies that do not work overtime - 955 (9 a.m. to 5 p.m., 5 days a week), work–life balance
Adabelief-Optimizer
Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"
BPE-Dropout
An official implementation of the "BPE-Dropout: Simple and Effective Subword Regularization" algorithm.
Chinese-Paraphrase-from-Quora
Research on the Construction and Application of Paraphrase Parallel Corpus
CLUE
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
compare-mt
A tool for holistic analysis of language generation systems
convtransformer
Code for the ACL 2020 paper "Character-Level Translation with Self-Attention"
deepmind-research
This repository contains implementations and illustrative code to accompany DeepMind publications
detecting_wsd_biases_for_nmt
Repository containing the experimental code for the publication 'Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks' (Emelin, Denis, Ivan Titov, and Rico Sennrich, EMNLP 2020).
emoji-cheat-sheet
A Markdown version of the emoji cheat sheet
fairseq-py
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
langdetect
Port of Google's language-detection library to Python.
long-range-arena
Long Range Arena for Benchmarking Efficient Transformers
m2scorer
MaxMatch (M^2) Scorer - Evaluation program for grammatical error correction systems.
macOS
macOS theme for Gnome and GTK-based desktops
Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
multiview-langrep
Bridging linguistic typology and multilingual machine translation with multi-view language representations
nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
NLPCC2018_GEC
The dataset and the evaluation tool for NLPCC2018 Shared Task 2: Grammatical Error Correction (GEC).
pedra
Post-editing Datasets by Rakuten (PEDRa)
PLMpapers
Must-read Papers on pre-trained language models.
prism
MT Evaluation in Many Languages via Zero-Shot Paraphrasing
RecBole
A unified, comprehensive and efficient recommendation library
TextAttack
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP
transformers_without_tears
Transformers without Tears: Improving the Normalization of Self-Attention
tx-ray
Code for the paper "TX-Ray: Quantifying and Explaining Model-Knowledge Transfer in (Un-)Supervised NLP"
workflow
Sogou framework for C++ backend development.
zero
Zero -- A neural machine translation system