TonyUSTC

baidu

beijing

TonyUSTC's starred repositories

leetcode

🔥LeetCode solutions in any programming language | Solutions in multiple programming languages for LeetCode, 剑指 Offer (2nd edition), and Cracking the Coding Interview (6th edition)

Language: Java | License: CC-BY-SA-4.0 | 29829 307 43

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | 22407 202 3364

insightface

State-of-the-art 2D and 3D Face Analysis Project

Language: Python | License: MIT | 21981 502 2432

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python | License: MIT | 9739 66 103

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | 9338 157 590

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python | License: Apache-2.0 | 7887 75 280

Synonyms

:herb: Chinese synonyms: a toolkit for chatbots and intelligent question answering

Language: Python | License: NOASSERTION | 4995 174 128

stopwords

Commonly used Chinese stopword lists (the HIT stopword list, the Baidu stopword list, and others)

text2vec

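text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.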

Language: Python | License: Apache-2.0 | 4252 30 146

CLUE

Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard

Language: Python | 3907 89 99

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python | License: Apache-2.0 | 1834 21 82

ALBEF

Code for ALBEF: a new vision-language pre-training method

Language: Python | License: BSD-3-Clause | 1447 11 139

3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Language: Python | License: Apache-2.0 | 915 18 81

BlueLM

BlueLM(蓝心大模型): Open large language models developed by vivo AI Lab

Language: Python | License: NOASSERTION | 811 14 27

Final_word_Similarity

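A word similarity method that combines the extended Tongyici Cilin (同义词词林扩展版) with HowNet, giving broader vocabulary coverage and more accurate results.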

Language: Python | License: MIT | 713 15 10

Advances-in-Label-Noise-Learning

A curated (most recent) list of resources for Learning with Noisy Labels

Megatron-LLaMA

Best practice for training LLaMA models in Megatron-LM

Language: Python | License: NOASSERTION | 571 6 58

Awesome-Noisy-Labels

A Survey

DivideMix

Code for paper: DivideMix: Learning with Noisy Labels as Semi-supervised Learning

Language: Python | License: MIT | 521 9 53

OmniEvent

A comprehensive, unified and modular event extraction toolkit.

Language: Python | License: MIT | 331 10 25

HiAGM

Hierarchy-Aware Global Model for Hierarchical Text Classification

Language: Python | License: MIT | 202 8 16

SCELoss-Reproduce

Reproduce Results for ICCV2019 "Symmetric Cross Entropy for Robust Learning with Noisy Labels" https://arxiv.org/abs/1908.06112

Language: Python | 178 5 8

contrastive-htc

This repository implements a contrastive learning model for hierarchical text classification. This work has been accepted as the long paper "Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification" in ACL 2022.

Language: Python | License: MIT | 122 2 32

NLPDataAugmentation

Chinese NLP Data Augmentation, BERT Contextual Augmentation

Language: Python | 109 7 5

NLP-Data-Augmentation

Two approaches to NLP text augmentation: synonym replacement (using a word2vec vocabulary) and back-translation

Language: Python | 66 2 2

lnl_sr

Learning with Noisy Labels via Sparse Regularization, ICCV2021

Language: Python | 46 2 4

hierarchical-multi-label-text-classification

PyTorch implementation of Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network

Language: Python | 36 3 8

replaceSynbycilin

Expands a corpus by replacing synonyms in question-answering text, using the HIT Tongyici Cilin synonym thesaurus

Language: Python | License: MIT | 3501

NLP_Chinese_data_Augment

A wrapper class for Chinese data augmentation: synonym replacement, random insertion, random swap, random deletion

Language: Python | License: Apache-2.0 | 4 20

RANSAC

⭐ RANSAC is an algorithm for fitting models to data with a possibly large amount of noise, where outlier points can otherwise degrade performance. This project builds a model to fit noisy linear data.

Language: C++ | 100