nuoline's starred repositories

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

1139400

UltraEval

[ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.

Language: Python | License: Apache-2.0 | 20500

Awesome-Chinese-LLM

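A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train. It covers base models, domain-specific fine-tunes and applications, datasets, and tutorials.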

1446000

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

License: MIT | 329600

llamafile

Distribute and run LLMs with a single file.

Language: C++ | License: NOASSERTION | 1870900

OpenBot

OpenBot leverages smartphones as brains for low-cost robots. We have designed a small electric vehicle that costs about $50 and serves as a robot body. Our software stack for Android smartphones supports advanced robotics workloads such as person following and real-time autonomous navigation.

Language: Swift | License: MIT | 281200

Safety-Prompts

Chinese safety prompts for evaluating and improving the safety of LLMs.

License: Apache-2.0 | 82500

opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Language: Python | License: Apache-2.0 | 362100

grok-1

Grok open release

Language: Python | License: Apache-2.0 | 4938600

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | 3852700

LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Language: Python | License: Apache-2.0 | 819300

llama.cpp

LLM inference in C/C++

Language: C++ | License: MIT | 6416100

Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Language: Python | License: Apache-2.0 | 1810000

LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Language: Python | License: Apache-2.0 | 2967100

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | 3591300

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python | License: MIT | 896900

Chinese-Mixtral-8x7B

Chinese Mixtral-8x7B (Chinese-Mixtral-8x7B)

Language: Python | License: Apache-2.0 | 63800

fast-llama

Runs LLaMA at extremely high speed

Language: C++ | License: MIT | 8200

WebCPM

Official code for the ACL 2023 paper "WebCPM: Interactive Web Search for Chinese Long-form Question Answering"

Language: HTML | License: Apache-2.0 | 97000

MNBVC

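MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.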

License: MIT | 333000

Llama-Chinese

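The Llama Chinese community. Online Llama3 demos and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. The goal is to build the best Chinese Llama large model, fully open source and commercially usable.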

Language: Python | 1349100

WenetSpeech

A 10000+ hours dataset for Chinese speech recognition

Language: Shell | License: Apache-2.0 | 48700

whisper-jax

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Language: Jupyter Notebook | License: Apache-2.0 | 432700

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | License: Apache-2.0 | 1136900

stablediffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Python | License: MIT | 3818900

stable-diffusion-webui

Stable Diffusion web UI

Language: Python | License: AGPL-3.0 | 13864900

stable-diffusion

A latent text-to-image diffusion model

Language: Jupyter Notebook | License: NOASSERTION | 6727200

GLM-130B

GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

Language: Python | License: Apache-2.0 | 765200

DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

Language: C++ | License: MPL-2.0 | 2497500

GLM

GLM (General Language Model)

Language: Python | License: MIT | 315000