lc's starred repositories
text-generation-inference
Large Language Model Text Generation Inference
chinese-chatbot-corpus
A public Chinese chat corpus
Luotuo-Chinese-LLM
骆驼 (Luotuo): Open-source Chinese language models. Developed by 陈启源 @ Central China Normal University, 李鲁鲁 @ SenseTime, and 冷子昂 @ SenseTime
ChineseSubFinder
Automated Chinese subtitle downloader. Supported subtitle sites: shooter, xunlei, arrst, a4k, SubtitleBest. Supports Emby, Jellyfin, Plex, Sonarr, Radarr, and TMM
Alpaca-CoT
We unify the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use, giving researchers a fine-tuning platform for large models that is easy to pick up. We welcome open-source enthusiasts to open any meaningful PR on this repo and to integrate as many LLM-related technologies as possible.
NLP-Interview-Notes
This repository mainly collects interview questions for NLP algorithm engineers
RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
Cherry_LLM
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
anime-character-chinese-dataset
A Chinese corpus of anime characters
data-toolbox
Our data munging code.
ChatGPT_Role-play_Dataset
This repository contains the ChatGPT Roleplay Dataset (CRD), which includes conversations with ChatGPT 3.5 in different scenarios, annotated to understand user intentions and the naturalness of model responses.