Beast code in Giters

Cao Lijun's starred repositories

LabelLLM

The Open-Source Data Annotation Platform

Language:TypeScriptApache-2.025200

PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Language:PythonApache-2.0369600

MinerU

A one-stop, open-source, high-quality data extraction tool, supports PDF/webpage/e-book extraction.一站式开源高质量数据提取工具，支持PDF/网页/多格式电子书提取。

Language:PythonAGPL-3.0596100

CodeGeeX4

CodeGeeX4-ALL-9B, a versatile model for all AI software development scenarios, including code completion, code interpreter, web search, function calling, repository-level Q&A and much more.

Language:PythonApache-2.081600

byte-unixbench

Automatically exported from code.google.com/p/byte-unixbench

Language:CGPL-2.03300

libnvidia-container

NVIDIA container runtime library

Language:CApache-2.078100

self-llm

《开源大模型食用指南》基于Linux环境快速部署开源大模型，更适合**宝宝的部署教程

Language:Jupyter NotebookApache-2.0690200

gorilla

Gorilla: An API store for LLMs

Language:PythonApache-2.01097000

LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Language:Jupyter NotebookNOASSERTION2391500

llms-from-scratch-cn

仅需Python基础，从0构建大语言模型；从0逐步构建GLM4\Llama3\RWKV6，深入理解大模型原理

Language:Jupyter NotebookNOASSERTION74800

d2l-zh

《动手学深度学习》：面向中文读者、能运行、可讨论。中英文版被70多个国家的500多所大学用于教学。

Language:PythonApache-2.06006200

GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型

Language:PythonApache-2.0402600

it-tools-zh_CN

为开发人员提供的方便的在线工具集合，具有出色的用户体验。【深度适配简体中文】

Language:VueGPL-3.04600

Qwen2

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Language:Shell665600

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc. 天工系列模型在3.2TB高质量多语言和代码数据上进行预训练。我们开源了模型参数，训练数据，评估数据，评估方法。

Language:PythonNOASSERTION119000

server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Language:PythonBSD-3-Clause784800

InternLM

Official release of InternLM2.5 7B base and chat models. 1M context support

Language:PythonApache-2.0593200

opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2,GPT-4,LLaMa2, Qwen,GLM, Claude, etc) over 100+ datasets.

Language:PythonApache-2.0345300

COIG-CQIA

7300

ollama

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.

Language:GoMIT8308700

chinese-llm-benchmark

中文大模型能力评测榜单：目前已囊括106个大模型，覆盖chatgpt、gpt4o、百度文心一言、阿里通义千问、讯飞星火、商汤senseChat、minimax等商用模型，以及百川、qwen2、glm4、yi、书生internLM2、llama3等开源大模型，多维度能力评测。不仅提供能力评分排行榜，也提供所有模型的原始输出结果！

201800

Awesome-Chinese-LLM

整理开源的中文大语言模型，以规模较小、可私有化部署、训练成本较低的模型为主，包括底座模型，垂直领域微调及应用，数据集与教程等。

1389500

inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Language:PythonApache-2.0398300

gewas