lc (lcvcl)

lc's starred repositories

MetaGPT

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Language: Python | License: MIT | Stargazers: 44061 | Issues: 899 | Issues: 635

ip2region

Ip2region (2.0 - xdb) is an offline IP address management framework and locator that supports billions of data segments with ten-microsecond search performance. The xdb engine is implemented in many programming languages.

Language: Go | License: Apache-2.0 | Stargazers: 16948 | Issues: 449 | Issues: 241

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python | License: Apache-2.0 | Stargazers: 16022 | Issues: 109 | Issues: 1049

searxng

SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.

Language: Python | License: AGPL-3.0 | Stargazers: 12811 | Issues: 113 | Issues: 1301

text-generation-inference

Large Language Model Text Generation Inference

Language: Python | License: Apache-2.0 | Stargazers: 8864 | Issues: 100 | Issues: 1316

sglang

SGLang is a fast serving framework for large language models and vision language models.

Language: Python | License: Apache-2.0 | Stargazers: 5434 | Issues: 56 | Issues: 546

OLMo

Modeling, training, eval, and inference code for OLMo

Language: Python | License: Apache-2.0 | Stargazers: 4483 | Issues: 47 | Issues: 193

chinese-chatbot-corpus

Public Chinese chat corpus

Language: Python | License: Apache-2.0 | Stargazers: 3975 | Issues: 75 | Issues: 18

Luotuo-Chinese-LLM

Luotuo (骆驼): Open-source Chinese language models. Developed by 陈启源 (Central China Normal University), 李鲁鲁 (SenseTime), and 冷子昂 (SenseTime)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3631 | Issues: 55 | Issues: 44

ChineseSubFinder

Automated Chinese subtitle downloading. Supported subtitle sites: shooter, xunlei, arrst, a4k, and SubtitleBest. Supports Emby, Jellyfin, Plex, Sonarr, Radarr, and TMM.

MNBVC

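MNBVC (Massive Never-ending BT Vast Chinese corpus) is an ultra-large-scale Chinese corpus that aims to match the 40T of data used to train chatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.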

YAYI

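YAYI large model: secure, reliable, dedicated large models built for customers. A series of LlaMA 2 & BLOOM models trained on large-scale, multi-domain Chinese and English instruction data, developed by the 中科闻歌 algorithm team. (Repo for YaYi Chinese LLMs based on LlaMA2 & BLOOM)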

Language: Python | License: Apache-2.0 | Stargazers: 3251 | Issues: 12 | Issues: 11

Alpaca-CoT

We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., lora, p-tuning) for easy use. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and integrate as many LLM-related technologies as possible. We built a fine-tuning platform that makes it easy for researchers to get started with and use large models.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2593 | Issues: 36 | Issues: 100

NLP-Interview-Notes

This repository mainly collects interview questions for NLP algorithm engineers.

ReAct

[ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models

Language: Jupyter Notebook | License: MIT | Stargazers: 1901 | Issues: 16 | Issues: 29

AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

Language: Python | License: MIT | Stargazers: 1680 | Issues: 16 | Issues: 394

RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Language: Jupyter Notebook | License: MIT | Stargazers: 1667 | Issues: 26 | Issues: 51

dolma

Data and tools for generating and inspecting OLMo pre-training data.

Language: Python | License: Apache-2.0 | Stargazers: 933 | Issues: 20 | Issues: 73

DMHY

Easily download or auto-download torrents from sites such as share.dmhy.org and acg.rip, for OS X

Language: Objective-C | License: MIT | Stargazers: 454 | Issues: 18 | Issues: 3

FuseAI

FuseAI Project

Language: Python | Stargazers: 440 | Issues: 0 | Issues: 0

LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Language: Jupyter Notebook | License: MIT | Stargazers: 348 | Issues: 5 | Issues: 26

DistServe

Disaggregated serving system for Large Language Models (LLMs).

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 296 | Issues: 4 | Issues: 39
Language: Python | License: Apache-2.0 | Stargazers: 288 | Issues: 2 | Issues: 7

Cherry_LLM

[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models

shu

A curated collection of Chinese books

Language: Python | License: MIT | Stargazers: 171 | Issues: 5 | Issues: 3

FuseAI

FuseAI Project

anime-character-chinese-dataset

Chinese-language corpus of anime characters

data-toolbox

Our data munging code.

Language: Python | License: AGPL-3.0 | Stargazers: 34 | Issues: 5 | Issues: 9

ChatGPT_Role-play_Dataset

This repository contains the ChatGPT Roleplay Dataset (CRD), which includes conversations with ChatGPT 3.5 in different scenarios, annotated to understand user intentions and the naturalness of model responses.

License: MIT | Stargazers: 3 | Issues: 1 | Issues: 0