Shen Hao's repositories
awesome-mpc
A curated list of multi-party computation resources and links.
paper_collector
🧐 Fully automated scripts for collecting CS-related papers from sources including arXiv and DBLP
AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
Awesome-Chinese-LLM
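A collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, including base models, vertical-domain fine-tunes and applications, datasets, and tutorials.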
Awesome-Domain-LLM
Collects and organizes open-source models, datasets, and evaluation benchmarks for vertical domains.
Awesome-Multimodal-Large-Language-Models
✨✨ Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
bertviz
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM
ChineseNLPCorpus
Chinese natural language processing datasets, material for everyday experiments. Additions and pull requests are welcome.
DecryptPrompt
A summary of Prompt & LLM papers, open-source data & models, and AIGC applications
embedchain
The Open Source RAG framework
FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.
Firefly
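Firefly (流萤): a Chinese conversational large language model (full-parameter fine-tuning + QLoRA), supporting fine-tuning of Mixtral-8x7B, Zephyr, Mistral, Aquila2, Baichuan2, CodeLlama, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya, Bloom, and other large models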
Hand-on-RAG
As the name suggests: a hand-built RAG
HiGPT
[KDD'2024] "HiGPT: Heterogeneous Graph Language Models"
Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM): a local knowledge-base question-answering app built on Langchain and language models such as ChatGLM
learning_research
My own research experience
Lets-Verify-Step-by-Step
"Improving Mathematical Reasoning with Process Supervision" by OpenAI
LLM-Continual-Learning-Papers
Must-read Papers on Large Language Model (LLM) Continual Learning
LLMAgentPapers
Must-read Papers on LLM Agents.
nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
picgo
shenhao's PicGo asset library
Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
raft
RAFT, or Retrieval-Augmented Fine-Tuning, is a method comprising a fine-tuning phase and a RAG-based retrieval phase. It is particularly suited to creating agents that realistically emulate a specific human target.
Risk_Finance
🤪A Repo for generating data on new types of illegal fundraising activities.
wimbd
What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets