MissPenguin's starred repositories

gpt_academic

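Provides a practical interaction interface for LLMs such as GPT and GLM, with a particular focus on paper reading, polishing, and writing. Modular design; supports custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and others.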

Language: Python | License: GPL-3.0 | Stargazers: 59064 | Issues: 248 | Issues: 1466

llama

Inference code for LLaMA models

Language: Python | License: NOASSERTION | Stargazers: 50895 | Issues: 499 | Issues: 872

ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Language: Python | License: Apache-2.0 | Stargazers: 39599 | Issues: 395 | Issues: 1285

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 34871 | Issues: 347 | Issues: 1678

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | License: Apache-2.0 | Stargazers: 28956 | Issues: 340 | Issues: 266

paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

License: Apache-2.0 | Stargazers: 24505 | Issues: 694 | Issues: 0

alpaca-lora

Instruct-tune LLaMA on consumer hardware

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 18264 | Issues: 156 | Issues: 467

ChatPaper

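Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow by using ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.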

Language: Python | License: NOASSERTION | Stargazers: 17783 | Issues: 88 | Issues: 214

Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

gpt4-pdf-chatbot-langchain

GPT4 & LangChain Chatbot for large PDF docs

PaddleHub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving)

Language: Python | License: Apache-2.0 | Stargazers: 12553 | Issues: 184 | Issues: 1291

Awesome-Chinese-LLM

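A collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, including base models, vertical-domain fine-tuning and applications, datasets, and tutorials.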

ChatRWKV

ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

Language: Python | License: Apache-2.0 | Stargazers: 9308 | Issues: 91 | Issues: 115

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | Stargazers: 8862 | Issues: 156 | Issues: 531

prompt-engineering-for-developers

An introductory LLM tutorial for developers; the Chinese edition of Andrew Ng's large model course series

Language: Jupyter Notebook | Stargazers: 8484 | Issues: 79 | Issues: 25

awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

Language: Python | License: MIT | Stargazers: 4308 | Issues: 88 | Issues: 10

FastDeploy

⚡️An Easy-to-use and Fast Deep Learning Model Deployment Toolkit for ☁️Cloud 📱Mobile and 📹Edge. Including Image, Video, Text and Audio 20+ main stream scenarios and 150+ SOTA models with end-to-end optimization, multi-platform and multi-framework support.

Language: C++ | License: Apache-2.0 | Stargazers: 2752 | Issues: 55 | Issues: 1108

Alpaca-CoT

We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use, building a fine-tuning platform that makes it easy for researchers to get started with large models. We welcome open-source enthusiasts to open any meaningful PR on this repo and to integrate as many LLM-related technologies as possible.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2498 | Issues: 37 | Issues: 97

pdf2docx

Open source Python library for converting PDF to DOCX.

Language: Python | License: AGPL-3.0 | Stargazers: 2211 | Issues: 24 | Issues: 234

DecryptPrompt

Summaries of Prompt & LLM papers, open-source data & models, and AIGC applications

MOSS-RLHF

MOSS-RLHF

Language: Python | License: Apache-2.0 | Stargazers: 1187 | Issues: 33 | Issues: 50

Skywork

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc.

Language: Python | License: NOASSERTION | Stargazers: 1118 | Issues: 21 | Issues: 61

OCR-SAM

Combining MMOCR with Segment Anything & Stable Diffusion. Automatically detect, recognize, and segment text instances, with several downstream tasks, e.g., Text Removal and Text Inpainting.

benchmarking-chinese-text-recognition

This repository contains datasets and baselines for benchmarking Chinese text recognition.

Language: Python | License: MIT | Stargazers: 390 | Issues: 5 | Issues: 25

InfiniteBench

Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718

Language: Python | License: MIT | Stargazers: 197 | Issues: 9 | Issues: 14

PaddleOCR-AutoHotkey

PaddleOCR AutoHotkey (AHK) version.

OCR_preprocessing_tool

A simple OCR preprocessing tool using Python with a GUI.

Language: Python | License: MIT | Stargazers: 27 | Issues: 1 | Issues: 0

optlab

OCR pre-processing Toolbox

PaddleOCR-Quicker

GUI for PaddleOCR whl based on Quicker