xidianwang412

xidianwang412's starred repositories

gao

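Configuration files for FongMi Video and TVBox. If you like them, please fork them for your own use. Read the repository notes carefully before use; using the files is taken to mean you have understood them.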

TVBox

TVBox network interfaces: frequently updated, with fast and stable access.

albert_zh

A Lite BERT for Self-supervised Learning of Language Representations; a large collection of Chinese pre-trained ALBERT models

UER-py

Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo

Language: Python | License: Apache-2.0 | Stargazers: 2956 | Issues: 75 | Issues: 263

roberta_zh

Chinese pre-trained RoBERTa models: RoBERTa for Chinese

pytorch-loss

label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful

Language: Python | License: MIT | Stargazers: 2149 | Issues: 23 | Issues: 38

BERT-NER-Pytorch

Chinese NER (Named Entity Recognition) using BERT (Softmax, CRF, Span)

Language: Python | License: MIT | Stargazers: 2051 | Issues: 13 | Issues: 104

pretrained-models

Open Language Pre-trained Model Zoo

Chinese-NLP-Corpus

Collections of Chinese NLP corpus

CLUEPretrainedModels

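A collection of high-quality Chinese pre-trained models: state-of-the-art large models, the fastest small models, and models specialized for similarity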

TVBox

TVBox sources for personal use, plus repository sources, live-stream sources, and more

box

For personal use; please do not publicize.

Language: JavaScript | Stargazers: 479 | Issues: 10 | Issues: 0

pCLUE

pCLUE: a 1,000,000+ multi-task prompt learning dataset

Language: Jupyter Notebook | Stargazers: 459 | Issues: 7 | Issues: 9

Pytorch-NLU

Pytorch-NLU, a Chinese text classification and sequence labeling toolkit. It supports multi-class and multi-label classification of long and short Chinese texts, as well as sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, word segmentation, and extractive text summarization.

Language: Python | License: Apache-2.0 | Stargazers: 318 | Issues: 4 | Issues: 12

ChineseNER

All about Chinese NER

MiniRBT

MiniRBT (a series of small Chinese pre-trained models)

Language: Python | License: Apache-2.0 | Stargazers: 240 | Issues: 7 | Issues: 4

awesome-cantonese-nlp

A curated list of resources dedicated to Natural Language Processing (NLP) of Cantonese | 粵語 NLP

License: CC-BY-4.0 | Stargazers: 82 | Issues: 7 | Issues: 0

encyclopediaCrawler

Baidu Baike crawler

Language: Python | License: Apache-2.0 | Stargazers: 65 | Issues: 2 | Issues: 4

SpiCE-Corpus

An open-access corpus of conversational bilingual speech in Cantonese and English

Language: JavaScript | Stargazers: 40 | Issues: 3 | Issues: 0

multi_task-learning

Multi-task learning with MMOE and PLE

electra-hongkongese

Pre-trained ELECTRA from Hong Kong data

Language: Python | License: Apache-2.0 | Stargazers: 26 | Issues: 0 | Issues: 0

openrice-senti

Scraped reviews from OpenRice for sentiment analysis. Formatted to use with BERT.

License: CC-BY-4.0 | Stargazers: 9 | Issues: 2 | Issues: 0

node-lemmatizer

node-lemmatizer is a lemmatization library to retrieve a base form from an English inflected word.

Language: JavaScript | License: MIT | Stargazers: 8 | Issues: 0 | Issues: 0

sinoparserd

A service to convert Chinese languages (Mandarin, Cantonese, Shanghainese, ...) into their transliterated form, segment them, etc.

Language: C++ | License: NOASSERTION | Stargazers: 7 | Issues: 0 | Issues: 0

bert-tokenizer

A simple tool to generate BERT tokens and input features

Language: TypeScript | License: Apache-2.0 | Stargazers: 7 | Issues: 0 | Issues: 0

cantonese-nlp-benchmark

Benchmark for Cantonese word segmentation and POS tagging

Language: Python | License: MIT | Stargazers: 6 | Issues: 1 | Issues: 0

sentence-tokenization

Simple tool for tokenizing sentences, for BERT or other NLP preprocessing.

Language: JavaScript | License: MIT | Stargazers: 5 | Issues: 0 | Issues: 0