Jiayi Zhou (Gaiejj)

Company: Peking University

Location: Beijing


Organizations
PKU-Alignment
PKU-MARL

Jiayi Zhou's starred repositories

mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Language: Python, License: MIT, Stargazers: 538, Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python, License: Apache-2.0, Stargazers: 18747, Issues: 0

safety-rbr-code-and-data

Code and example data for the paper: Rule Based Rewards for Language Model Safety

Language: Jupyter Notebook, License: MIT, Stargazers: 111, Issues: 0

ProgressGym

Alignment with a millennium of moral progress.

Language: Python, License: MIT, Stargazers: 7, Issues: 0

anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Language: Python, Stargazers: 599, Issues: 0

SEED-Story

SEED-Story: Multimodal Long Story Generation with Large Language Model

Language: Python, License: NOASSERTION, Stargazers: 652, Issues: 0

denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

Language: Python, License: MIT, Stargazers: 7748, Issues: 0

DRLX

Diffusion Reinforcement Learning Library

Language: Python, License: MIT, Stargazers: 169, Issues: 0

Dense_Reward_T2I

Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24).

Language: Python, Stargazers: 24, Issues: 0

VideoElevator

[Arxiv 2024] Official pytorch implementation of "VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models"

Language: Python, Stargazers: 135, Issues: 0

SPO

Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step

Language: Python, Stargazers: 124, Issues: 0

DiffusionDPO

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Language: Python, License: Apache-2.0, Stargazers: 213, Issues: 0

ddpo-pytorch

DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support

Language: Python, License: MIT, Stargazers: 380, Issues: 0

ddpo

Code for the paper "Training Diffusion Models with Reinforcement Learning"

Language: Python, License: MIT, Stargazers: 305, Issues: 0

annotated_deep_learning_paper_implementations

🧑‍🏫 60 implementations/tutorials of deep learning papers with side-by-side notes 📝, including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), GANs (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Python, License: MIT, Stargazers: 52947, Issues: 0

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python, License: NOASSERTION, Stargazers: 5869, Issues: 0

VADER

Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models, such as VideoCrafter, OpenSora, ModelScope, and StableVideoDiffusion, by finetuning them using various reward models, such as HPS, PickScore, VideoMAE, VJEPA, YOLO, and Aesthetics.

Language: Python, Stargazers: 169, Issues: 0

generative-models

Generative Models by Stability AI

Language: Python, License: MIT, Stargazers: 23789, Issues: 0

Segment-Everything-Everywhere-All-At-Once

[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"

Language: Python, License: Apache-2.0, Stargazers: 4274, Issues: 0

align-anything

Align Anything: Training Any Modality Model with Feedback

Language: Python, License: Apache-2.0, Stargazers: 78, Issues: 0

RLHF-V

[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback

Language: Python, Stargazers: 207, Issues: 0

LLaVA-RLHF

Aligning LMMs with Factually Augmented RLHF

Language: Python, License: GPL-3.0, Stargazers: 291, Issues: 0

NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Language: Python, License: BSD-3-Clause, Stargazers: 3151, Issues: 0

ImageBind

ImageBind: One Embedding Space to Bind Them All

Language: Python, License: NOASSERTION, Stargazers: 8157, Issues: 0

richhf-18k

The RichHF-18K dataset contains the rich human feedback labels we collected for our CVPR'24 paper (https://arxiv.org/pdf/2312.10240), along with the file names of the associated labeled images (no URLs or images are included in this dataset).

Stargazers: 87, Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Stargazers: 11166, Issues: 0

Qwen2

Qwen2 is the large language model series developed by the Qwen team, Alibaba Cloud.

Language: Shell, Stargazers: 6905, Issues: 0

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python, License: Apache-2.0, Stargazers: 29082, Issues: 0

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python, License: Apache-2.0, Stargazers: 1901, Issues: 0

mergoo

A library for easily merging multiple LLM experts and efficiently training the merged LLM.

Language: Python, License: LGPL-3.0, Stargazers: 382, Issues: 0