tianyu-z

Tianyu Zhang's repositories

VCR

Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.

Language: Python | License: CC-BY-SA-4.0 | 800

alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Language: Jupyter Notebook | License: MIT | 1 10

VCR-wiki-en-easy-test-500

Raw data for VCR-wiki-en-easy-test-500 from https://huggingface.co/datasets/vcr-org/VCR-wiki-en-easy-test-500

License: CC-BY-SA-4.0 | 100

VCR-wiki-zh-easy-test-500

Raw data for VCR-wiki-zh-easy-test-100 from https://huggingface.co/datasets/vcr-org/VCR-wiki-zh-easy-test-100

License: CC-BY-SA-4.0 | 100

VCR-wiki-zh-hard-test-500

Raw data for VCR-wiki-zh-hard-test-500 from https://huggingface.co/datasets/vcr-org/VCR-wiki-zh-hard-test-500

License: CC-BY-SA-4.0 | 100

AlphaCLIP

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

License: Apache-2.0 | 000

Best-README-Template

An awesome README template to jumpstart your projects!

License: MIT | 000

CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Language: Python | License: Apache-2.0 | 000

Connect-4-Gym-env-Reinforcement-learning

Connect Four Environment is a project designed for training reinforcement learning models to play the classic Connect4 game. It's compatible with OpenAI Gym / Gymnasium, includes a variety of bots, an Elo leaderboard system, and supports both FCN and CNN policies.

Language: Python | License: MIT | 000

dreamerv3

Mastering Diverse Domains through World Models

License: MIT | 000

EfficientZero

Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.

License: GPL-3.0 | 000

gfn-lm-tuning

Language: Jupyter Notebook | License: MIT | 000

GreenBond_Notes

Language: Python | 010

Grounded-Segment-Anything

Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

License: Apache-2.0 | 000

light_on_chatgpt

Useful for e-ink monitor users who use ChatGPT. It makes the code blocks white and widens the UI.

Language: CSS | License: MIT | 010

lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Language: Python | License: NOASSERTION | 000

maze-transformer

This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks.

Language: Jupyter Notebook | 000

MergeLM

Codebase for Merging Language Models

000

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

License: Apache-2.0 | 000

multipleWindow3dScene

A quick example of how one can "synchronize" a 3d scene across multiple windows using three.js and localStorage

Language: JavaScript | License: MIT | 000

pykan

Kolmogorov Arnold Networks

License: MIT | 000

pymdp

A Python implementation of active inference for Markov Decision Processes

License: MIT | 000

Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Language: Python | License: Apache-2.0 | 000

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | 000

surya

OCR, layout analysis, reading order, line detection in 90+ languages

License: GPL-3.0 | 000

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

License: MIT | 000

cloudimage

VCR-wiki-en-hard-test-500

whisper

Yuan-2.0