zhaobingbingbing's starred repositories

Awesome-LLMs-Datasets

A summary of existing representative text datasets for LLMs.

License: Apache-2.0 | Stargazers: 639 | Issues: 0

MAmmoTH

Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024)

Language: Jupyter Notebook | Stargazers: 289 | Issues: 0

evol-teacher

Open Source WizardCoder Dataset

Language: Python | License: Apache-2.0 | Stargazers: 138 | Issues: 0

da-fusion

Effective Data Augmentation With Diffusion Models

Language: Python | License: MIT | Stargazers: 173 | Issues: 0

AIGS

AI-Generated Images as Data Source: The Dawn of Synthetic Era

Language: TeX | Stargazers: 136 | Issues: 0

GPTeacher

A collection of modular datasets generated by GPT-4: General-Instruct, Roleplay-Instruct, Code-Instruct, and Toolformer.

Language: Python | License: MIT | Stargazers: 1574 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 17404 | Issues: 0

annotated_deep_learning_paper_implementations

🧑‍🏫 60 implementations/tutorials of deep learning papers with side-by-side notes 📝, including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Jupyter Notebook | License: MIT | Stargazers: 49752 | Issues: 0

NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Language: Python | License: BSD-3-Clause | Stargazers: 2987 | Issues: 0

leetcode

LeetCode solutions in Python, with unittest tests.

Language: Python | Stargazers: 10 | Issues: 0

abdominal_ultrasound_classification

Combining deep neural networks with PCA and k-NN classification for abdominal organ recognition in ultrasound images.

Language: Python | License: MIT | Stargazers: 15 | Issues: 0

EchoDiffusion

MICCAI 2023 code for the paper: Feature-Conditioned Cascaded Video Diffusion Models for Precise Echocardiogram Synthesis. EchoDiffusion is a collection of video diffusion models trained from scratch on the EchoNet-Dynamic dataset with the imagen-pytorch repo.

Language: Python | License: MIT | Stargazers: 45 | Issues: 0

echo_from_noise

Code to implement "Echo from noise: synthetic ultrasound image generation using diffusion models for real image segmentation"

Language: Python | License: MIT | Stargazers: 26 | Issues: 0

chest-xray-synthesis

Using GANs and Stable Diffusion to generate chest X-ray data points and evaluating them with convolutional classifiers.

Language: Jupyter Notebook | License: MIT | Stargazers: 15 | Issues: 0

codellama

Inference code for CodeLlama models

Language: Python | License: NOASSERTION | Stargazers: 15348 | Issues: 0

XrayGLM

🩺 The first Chinese multimodal medical large model that can read chest X-rays and summarize chest radiographs.

Language: Python | License: NOASSERTION | Stargazers: 833 | Issues: 0

Awesome-Chinese-LLM

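A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train, covering base models, vertical-domain fine-tuning and applications, datasets, and tutorials.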

Stargazers: 12397 | Issues: 0

Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

Language: Python | License: MIT | Stargazers: 3493 | Issues: 0

Firefly

Firefly: a training tool for large models, supporting training of Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models.

Language: Python | Stargazers: 5024 | Issues: 0

evolve-instruct

Evolve LLM training instructions, from English instructions to any language.

Language: Python | Stargazers: 99 | Issues: 0

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | License: Apache-2.0 | Stargazers: 29026 | Issues: 0

Chinese_from_dongxiexidian

Mirror of dongxiexidian/Chinese

Language: Python | Stargazers: 277 | Issues: 0

llm

The Roadmap for LLMs

Stargazers: 82 | Issues: 0

self-instruct

Aligning pretrained language models with instruction data generated by themselves.

Language: Python | License: Apache-2.0 | Stargazers: 3877 | Issues: 0

self-instruct-zh

A Chinese self-instruct dataset built with ChatGPT.

Stargazers: 98 | Issues: 0

sft_datasets

A curated collection of open-source SFT datasets, updated on an ongoing basis.

Stargazers: 376 | Issues: 0

LLMforDialogDataGenerate

Generate dialog data from documents using LLMs such as ChatGLM2 or ChatGPT.

Language: Python | Stargazers: 125 | Issues: 0

Language: Python | License: MIT | Stargazers: 674 | Issues: 0

StableSR

Exploiting Diffusion Prior for Real-World Image Super-Resolution

Language: Python | License: NOASSERTION | Stargazers: 1917 | Issues: 0

RePaint

Official PyTorch Code and Models of "RePaint: Inpainting using Denoising Diffusion Probabilistic Models", CVPR 2022

Language: Python | Stargazers: 1849 | Issues: 0