TonyUSTC



Company: Baidu

Location: Beijing


TonyUSTC's starred repositories

leetcode

🔥 LeetCode solutions in any programming language | Solutions in multiple programming languages to LeetCode, "剑指 Offer" (Coding Interviews, 2nd edition), and "Cracking the Coding Interview" (6th edition)

Language: Java | License: CC-BY-SA-4.0 | Stargazers: 29829 | Issues: 307 | Issues: 43

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 22407 | Issues: 202 | Issues: 3364

insightface

State-of-the-art 2D and 3D Face Analysis Project

Language: Python | License: MIT | Stargazers: 21981 | Issues: 502 | Issues: 2432

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python | License: MIT | Stargazers: 9739 | Issues: 66 | Issues: 103

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | Stargazers: 9338 | Issues: 157 | Issues: 590

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 7887 | Issues: 75 | Issues: 280

Synonyms

:herb: Chinese synonyms: a toolkit for chatbots and intelligent question answering

Language: Python | License: NOASSERTION | Stargazers: 4995 | Issues: 174 | Issues: 128

stopwords

Commonly used Chinese stopword lists (the Harbin Institute of Technology stopword list, the Baidu stopword list, and others)

text2vec

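text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and is ready to use out of the box.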

Language: Python | License: Apache-2.0 | Stargazers: 4252 | Issues: 30 | Issues: 146

CLUE

Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python | License: Apache-2.0 | Stargazers: 1834 | Issues: 21 | Issues: 82

ALBEF

Code for ALBEF: a new vision-language pre-training method

Language: Python | License: BSD-3-Clause | Stargazers: 1447 | Issues: 11 | Issues: 139

3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Language: Python | License: Apache-2.0 | Stargazers: 915 | Issues: 18 | Issues: 81

BlueLM

BlueLM(蓝心大模型): Open large language models developed by vivo AI Lab

Language: Python | License: NOASSERTION | Stargazers: 811 | Issues: 14 | Issues: 27

Final_word_Similarity

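A word similarity method that combines the extended Tongyici Cilin (同义词词林扩展版) with HowNet (知网), giving broader vocabulary coverage and more accurate results.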

Language: Python | License: MIT | Stargazers: 713 | Issues: 15 | Issues: 10

Advances-in-Label-Noise-Learning

A curated (most recent) list of resources for Learning with Noisy Labels

Megatron-LLaMA

Best practice for training LLaMA models in Megatron-LM

Language: Python | License: NOASSERTION | Stargazers: 571 | Issues: 6 | Issues: 58

DivideMix

Code for paper: DivideMix: Learning with Noisy Labels as Semi-supervised Learning

Language: Python | License: MIT | Stargazers: 521 | Issues: 9 | Issues: 53

OmniEvent

A comprehensive, unified and modular event extraction toolkit.

Language: Python | License: MIT | Stargazers: 331 | Issues: 10 | Issues: 25

HiAGM

Hierarchy-Aware Global Model for Hierarchical Text Classification

Language: Python | License: MIT | Stargazers: 202 | Issues: 8 | Issues: 16

SCELoss-Reproduce

Reproduce Results for ICCV2019 "Symmetric Cross Entropy for Robust Learning with Noisy Labels" https://arxiv.org/abs/1908.06112

contrastive-htc

This repository implements a contrastive learning model for hierarchical text classification. The work was accepted at ACL 2022 as the long paper "Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification".

Language: Python | License: MIT | Stargazers: 122 | Issues: 2 | Issues: 32

NLPDataAugmentation

Chinese NLP Data Augmentation, BERT Contextual Augmentation

NLP-Data-Augmentation

Two approaches to NLP text augmentation: synonym replacement (using a word2vec vocabulary) and back-translation

lnl_sr

Learning with Noisy Labels via Sparse Regularization, ICCV2021

hierarchical-multi-label-text-classification

PyTorch implementation of Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network

replaceSynbycilin

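Expands a corpus by replacing synonyms in question-answering text, using the Harbin Institute of Technology synonym thesaurus (Tongyici Cilin)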

Language: Python | License: MIT | Stargazers: 35 | Issues: 0 | Issues: 1

NLP_Chinese_data_Augment

A wrapper class for Chinese data augmentation: synonym replacement, random insertion, random swap, random deletion

Language: Python | License: Apache-2.0 | Stargazers: 4 | Issues: 2 | Issues: 0

RANSAC

⭐ RANSAC is an algorithm for fitting models to data that may contain a large amount of noise, where outlier points would otherwise degrade the fit. This project builds a model to fit noisy linear data.

Language: C++ | Stargazers: 1 | Issues: 0 | Issues: 0
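
As an illustration of the RANSAC idea described in the last entry, here is a minimal Python sketch of fitting a line to noisy 2D points. It is not taken from the listed repository (which is written in C++); the function names `fit_line` and `ransac_line`, the parameters `n_iters`, `threshold`, and `seed`, and the use of vertical residuals are all assumptions made for this example.

```python
import random


def fit_line(p, q):
    """Return (slope, intercept) of the line through two points, or None if vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def ransac_line(points, n_iters=200, threshold=0.5, seed=0):
    """Repeatedly fit a line to two random points and keep the one with most inliers.

    A point counts as an inlier when its vertical distance to the candidate
    line is at most `threshold` (a simplification; perpendicular distance is
    another common choice).
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        model = fit_line(*rng.sample(points, 2))
        if model is None:
            continue  # skip degenerate (vertical) samples
        slope, intercept = model
        inliers = [
            (x, y) for x, y in points
            if abs(y - (slope * x + intercept)) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


if __name__ == "__main__":
    # Points on y = 2x + 1 plus a few gross outliers.
    data = [(x, 2 * x + 1) for x in range(20)]
    data += [(3, 40), (7, -30), (12, 100), (15, -50), (5, 60)]
    model, inliers = ransac_line(data)
    print("model:", model, "inlier count:", len(inliers))
```

In practice the final model is often refined with a least-squares fit over the inlier set; that step is left out here to keep the sketch short.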