Tao Wang (twang2218)

Location: Sydney, Australia

Organizations
bayleeadamoss

Tao Wang's starred repositories

Stargazers: 210 | Issues: 0

CMMLU

CMMLU: Measuring massive multitask language understanding in Chinese

Language: Python | Stargazers: 633 | Issues: 0

Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Language: Python | License: Apache-2.0 | Stargazers: 12807 | Issues: 0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 4430 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 465 | Issues: 0

text-classification-cn

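Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pretrained models.

Language: Python | Stargazers: 144 | Issues: 0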

OpenCLaP

Open Chinese Language Pre-trained Model Zoo

License: MIT | Stargazers: 976 | Issues: 0

Awesome-Chinese-NLP

A curated list of resources for Chinese NLP

License: Apache-2.0 | Stargazers: 7749 | Issues: 0

nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

License: MIT | Stargazers: 9316 | Issues: 0

glyph

Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean?

Language: Shell | License: BSD-3-Clause | Stargazers: 172 | Issues: 0

WanJuan1.0

WanJuan 1.0 multimodal corpus

License: CC-BY-4.0 | Stargazers: 448 | Issues: 0

YAYI

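YaYi large language models: secure and reliable custom large models built for customers, a series of LLaMA 2 and BLOOM models trained on large-scale Chinese and English multi-domain instruction data, developed by the Zhongke Wenge (中科闻歌) algorithm team. (Repo for YaYi Chinese LLMs based on LLaMA 2 and BLOOM)

Language: Python | License: Apache-2.0 | Stargazers: 3245 | Issues: 0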

All_Dictionaries

The most complete directory of online dictionary websites in the universe

Stargazers: 1584 | Issues: 0

MiNLP

XiaoMi Natural Language Processing Toolkits

Language: Scala | License: Apache-2.0 | Stargazers: 779 | Issues: 0
Stargazers: 422 | Issues: 0
License: Apache-2.0 | Stargazers: 18 | Issues: 0

sikufenci

A Python toolkit for word segmentation of ancient texts written in Traditional Chinese

Language: Python | Stargazers: 31 | Issues: 0

SikuBERT-for-digital-humanities-and-classical-Chinese-information-processing

SikuBERT: a pre-trained language model for the Siku Quanshu (四库全书)

License: Apache-2.0 | Stargazers: 108 | Issues: 0

Zhongjing

A Chinese medical ChatGPT based on LLaMA, trained on a large-scale pre-training corpus and a multi-turn dialogue dataset.

Language: Python | License: Apache-2.0 | Stargazers: 276 | Issues: 0
Language: JavaScript | License: GPL-3.0 | Stargazers: 69 | Issues: 0

shu

Collection of Chinese Books

Language: Python | License: MIT | Stargazers: 161 | Issues: 0

MNBVC

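MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even Martian script (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki articles, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.

License: MIT | Stargazers: 3257 | Issues: 0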

openclas

Automatically exported from code.google.com/p/openclas

Language: C++ | License: NOASSERTION | Stargazers: 1 | Issues: 0

Chinese-LlaMA2

Repo for adapting Meta LLaMA 2 to Chinese! A Chinese-adapted version of Meta's newly released LLaMA 2 (fully open source and available for commercial use).

Language: Python | Stargazers: 748 | Issues: 0

Llama-Chinese

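Llama Chinese community. The Llama 3 online demo and fine-tuned models are now available, the latest Llama 3 learning resources are collected in real time, and all code has been updated for Llama 3. The goal is to build the best Chinese Llama large model, fully open source and available for commercial use.

Language: Python | Stargazers: 13105 | Issues: 0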

Fengshenbang-LM

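Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.

Language: Python | License: Apache-2.0 | Stargazers: 3969 | Issues: 0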

Awesome-Chinese-LLM

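A curated collection of open-source Chinese large language models, focusing on models that are smaller in scale, privately deployable, and cheaper to train. It covers base models, domain-specific fine-tuning and applications, datasets, and tutorials.

Stargazers: 13705 | Issues: 0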

llama.cpp

LLM inference in C/C++

Language: C++ | License: MIT | Stargazers: 62528 | Issues: 0

LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Language: Python | Stargazers: 9693 | Issues: 0

sentence-transformers

Multilingual Sentence & Image Embeddings with BERT

Language: Python | License: Apache-2.0 | Stargazers: 14527 | Issues: 0