WangZeJun (zejunwang1)

Company: Tsinghua University

Location: Haidian, Beijing

WangZeJun's repositories

CSTS

Chinese natural language inference and semantic similarity datasets

LLMTuner

Instruction-tuning tool for large language models (supports FlashAttention)

Language: Python | License: Apache-2.0 | Stargazers: 166 | Issues: 4 | Issues: 12

bert4vec

A sentence-embedding generation tool based on pretrained models

Language: Python | License: Apache-2.0 | Stargazers: 132 | Issues: 2 | Issues: 5

bert_text_classification

Chinese text classification tool based on BERT

chatglm_tuning

Parameter-efficient fine-tuning of ChatGLM-6B with LoRA and P-Tuning v2

Language: Python | License: Apache-2.0 | Stargazers: 54 | Issues: 2 | Issues: 4

bertorch

PyTorch implementation of BERT with fine-tuning for downstream tasks

Language: Python | License: Apache-2.0 | Stargazers: 47 | Issues: 3 | Issues: 2

easytokenizer

High-performance text tokenizer library

Language: C | License: MIT | Stargazers: 26 | Issues: 2 | Issues: 4

bloom_tuning

Instruction fine-tuning of BLOOM models

Language: Python | License: Apache-2.0 | Stargazers: 24 | Issues: 1 | Issues: 6

gpt2classifier

Text classification fine-tuning based on a Chinese GPT2 pretrained model

Language: Python | License: Apache-2.0 | Stargazers: 20 | Issues: 1 | Issues: 0

darmatch

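A highly efficient string matching tool that supports forward/backward maximum matching word segmentation and multi-pattern exact string matching

Language: C++ | License: MIT | Stargazers: 17 | Issues: 3 | Issues: 1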

fastMatch

Large-scale exact string matching tool

Language: C++ | License: MIT | Stargazers: 15 | Issues: 2 | Issues: 0

lightltp

Chinese LTP lexical analysis based on the onnxruntime inference engine

gpt2ppl-zh

Sentence perplexity computation based on a Chinese GPT2 pretrained model

Language: Python | License: MIT | Stargazers: 11 | Issues: 1 | Issues: 2

simbert_distill

Two-stage SimBERT distillation

Language: Python | License: Apache-2.0 | Stargazers: 8 | Issues: 2 | Issues: 1

fastlcs

An effective tool for solving LCS problems

Language: C++ | License: MIT | Stargazers: 7 | Issues: 1 | Issues: 0

CTCDataset

A collection of Chinese text correction datasets

License: Apache-2.0 | Stargazers: 4 | Issues: 1 | Issues: 0

ElectraForSpellingCheck

Chinese spelling check based on the Electra pretrained model

Language: Python | License: Apache-2.0 | Stargazers: 3 | Issues: 1 | Issues: 1

stringutils

A bridge between Unicode encoded strings and std::string.

Language: C++ | License: MIT | Stargazers: 2 | Issues: 2 | Issues: 0

asio

Asio C++ Library

Language: C++ | License: NOASSERTION | Stargazers: 1 | Issues: 0 | Issues: 0

Firefly

Firefly (流萤): a Chinese conversational large language model

Language: Python | Stargazers: 1 | Issues: 0 | Issues: 0

learn-unicode

Learn Unicode the easy way

License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0
Language: C++ | Stargazers: 1 | Issues: 2 | Issues: 0

sentsplit

Chinese sentence splitting

Language: Python | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0

system_info

Get hardware information, header-only

Language: C | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0

zejunwang1

Config files for my GitHub profile.

BELLE

BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model)

Language: HTML | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

learn-regex

Learn regex the easy way

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

linux-command

Summary of commonly used Linux commands

Stargazers: 0 | Issues: 0 | Issues: 0

tiny-utf8

Unicode (UTF-8) capable std::string

Language: C++ | License: BSD-3-Clause | Stargazers: 0 | Issues: 1 | Issues: 0

TurboTransformers

A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT2, decoders, etc.) on CPU and GPU.

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0