shelleyyyyu

shelleyyyyu's starred repositories

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python · MIT · 1925800

PubLayNet

Language: Jupyter Notebook · NOASSERTION · 88600

github-typo-corpus

GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors

Language: Python · 47800

Gramformer

A framework for detecting, highlighting and correcting grammatical errors on natural language text. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.

Language: Python · MIT · 149300

MiduCTC-competition

Baseline for the Text Intelligent Proofreading Competition (Chinese Text Correction)

Language: Python · 6100

reinforcement-learning-an-introduction

Python Implementation of Reinforcement Learning: An Introduction

Language: Python · MIT · 1336500

MuCGEC

Open-source release of the MuCGEC Chinese error correction dataset and SOTA text correction models; Code & Data for our NAACL 2022 Paper "MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction"

Language: Python · Apache-2.0 · 47300

Paper-Writing-Tips

A repository curated by the MLNLP community to help authors avoid small mistakes in paper submissions. Paper Writing Tips

337600

soft-masked-bert-for-spelling-error-correction

A third-party implementation of the paper "Spelling Error Correction with Soft-Masked BERT" using tensorflow==1.12.0

Language: Python · 2300

google-research

Google Research

Language: Jupyter Notebook · Apache-2.0 · 3351600

EDA_NLP_for_Chinese

An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese text. NLP data augmentation. Paper reading notes.

Language: Python · 133900

CTC-Report

CTC2021: SOTA solution and online demo for the Chinese Text Correction competition

Apache-2.0 · 7000

Automatic-Corpus-Generation

This repository is for the paper "A Hybrid Approach to Automatic Corpus Generation for Chinese Spelling Check"

Language: Python · 28400

Synonyms

:herb: Chinese synonyms: a toolkit for chatbots and intelligent question answering

Language: Python · NOASSERTION · 500000

NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Language: Python · MIT · 2245800

NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Language: Python · MIT · 500

GEC-Info

Repository to collect and categorize Grammatical Error Correction papers.

11200

DataAug4NLP

Collection of papers and resources for data augmentation for NLP.

82300

eda_nlp

Data augmentation for NLP, presented at EMNLP 2019

Language: Python · 156900

weixin_public_corpus

WeChat official account corpus

56800

BERT

A simple yet complete implementation of the popular BERT model

Language: Python · 12400

m2scorer

MaxMatch (M^2) Scorer - Evaluation program for grammatical error correction systems.

Language: Python · GPL-2.0 · 14400

NLPCC2018_GEC

Data for NLPCC2018 Shared Task--Grammatical Error Correction (GEC).

800

pycorrector

pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error correction scenarios and works out of the box.

Language: Python · Apache-2.0 · 537400

FASPell

2019 SOTA spell checker for Simplified and Traditional Chinese: FASPell Chinese Spell Checker (Chinese spelling error detection / correction / checking)

Language: Python · GPL-3.0 · 119200

gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)

Language: Python · Apache-2.0 · 87600

TtT

Code for the ACL 2021 paper "Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction"

Language: Python · 9900

RecBole

A unified, comprehensive and efficient recommendation library

Language: Python · MIT · 328800

joint-kg-recommender

Language: Python · 23900

One-shot-Relational-Learning

Code for One-shot Relational Learning for Knowledge Graphs (EMNLP18)

Language: Python · Apache-2.0 · 23900