Zhaoyang Liu's starred repositories
openai-python
The official Python library for the OpenAI API
llm-foundry
LLM training code for Databricks foundation models
promptsource
Toolkit for creating, sharing and using natural language prompts.
awesome-public-datasets
A topic-centric list of HQ open datasets.
ChineseNlpCorpus
搜集、整理、发布 中文 自然语言处理 语料/数据集,与 有志之士 共同 促进 中文 自然语言处理 的 发展。
Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tuning) together for easy use. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and integrate as many LLM related technologies as possible. 我们打造了方便研究人员上手和使用大模型等微调平台,我们欢迎开源爱好者发起任何有意义的pr!
awesome-instruction-dataset
A collection of open-source dataset to train instruction-following LLMs (ChatGPT,LLaMA,Alpaca)
Chinese-LLaMA-Alpaca
中文LLaMA&Alpaca大语言模型+本地CPU/GPU训练部署 (Chinese LLaMA & Alpaca LLMs)
VisualGLM-6B
Chinese and English multimodal conversational language model | 多模态中英双语对话语言模型
Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought, Instruction-Tuning and Multimodality.
LangChain-Chinese-Getting-Started-Guide
LangChain 的中文入门教程
tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Pix2Text
An Open-Source Python3 tool for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.
PaLM-pytorch
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways