MissPenguin's starred repositories
gpt_academic
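Provides a practical interactive interface for GPT, GLM, and other large language models (LLMs), specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and others.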
ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
paper-reading
Paragraph-by-paragraph close readings of classic and recent deep learning papers
alpaca-lora
Instruct-tune LLaMA on consumer hardware
Awesome-LLM
Awesome-LLM: a curated list of Large Language Models
gpt4-pdf-chatbot-langchain
GPT4 & LangChain Chatbot for large PDF docs
Awesome-Chinese-LLM
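A collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train, covering base models, vertical-domain fine-tuning and applications, datasets, and tutorials.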
Megatron-LM
Ongoing research training transformer models at scale
prompt-engineering-for-developers
An introductory LLM tutorial for developers; the Chinese edition of Andrew Ng's large-model course series
awesome-pretrained-chinese-nlp-models
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
FastDeploy
⚡️An Easy-to-use and Fast Deep Learning Model Deployment Toolkit for ☁️Cloud, 📱Mobile, and 📹Edge. Covers 20+ mainstream scenarios across image, video, text, and audio, and 150+ SOTA models, with end-to-end optimization and multi-platform, multi-framework support.
Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use, giving researchers a fine-tuning platform for large models that is easy to pick up. We welcome open-source enthusiasts to open any meaningful PR on this repo and to integrate as many LLM-related technologies as possible.
DecryptPrompt
A summary of Prompt & LLM papers, open-source data & models, and AIGC applications
Skywork
The Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model parameters, training data, evaluation data, evaluation methods, etc.
benchmarking-chinese-text-recognition
This repository contains datasets and baselines for benchmarking Chinese text recognition.
InfiniteBench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
PaddleOCR-AutoHotkey
PaddleOCR AutoHotkey (AHK) version.
OCR_preprocessing_tool
A simple OCR preprocessing tool using Python with a GUI.
PaddleOCR-Quicker
A GUI for the PaddleOCR whl package, based on Quicker