Tianyu Zhang's repositories
cloudimage
Personal
alpha-zero-general
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
VCR-wiki-en-easy-test-500
Raw data for VCR-wiki-en-easy-test-500 from https://huggingface.co/datasets/vcr-org/VCR-wiki-en-easy-test-500
VCR-wiki-zh-easy-test-500
Raw data for VCR-wiki-zh-easy-test-500 from https://huggingface.co/datasets/vcr-org/VCR-wiki-zh-easy-test-500
VCR-wiki-zh-hard-test-500
Raw data for VCR-wiki-zh-hard-test-500 from https://huggingface.co/datasets/vcr-org/VCR-wiki-zh-hard-test-500
AlphaCLIP
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Best-README-Template
An awesome README template to jumpstart your projects!
CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
Connect-4-Gym-env-Reinforcement-learning
Connect Four Environment is a project designed for training reinforcement learning models to play the classic Connect4 game. It's compatible with OpenAI Gym / Gymnasium, includes a variety of bots, an Elo leaderboard system, and supports both FCN and CNN policies.
dreamerv3
Mastering Diverse Domains through World Models
EfficientZero
Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.
Grounded-Segment-Anything
Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
light_on_chatgpt
Makes ChatGPT easier to use on e-ink monitors: it turns the code blocks white and widens the UI.
lmms-eval
Accelerating the development of large multimodal models (LMMs) with lmms-eval
maze-transformer
This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks.
MergeLM
Codebase for Merging Language Models
mPLUG-DocOwl
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
multipleWindow3dScene
A quick example of how one can "synchronize" a 3d scene across multiple windows using three.js and localStorage
pykan
Kolmogorov Arnold Networks
pymdp
A Python implementation of active inference for Markov Decision Processes
Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
surya
OCR, layout analysis, reading order, line detection in 90+ languages
VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
VCR-wiki-en-hard-test-500
Raw data for VCR-wiki-en-hard-test-500 from https://huggingface.co/datasets/vcr-org/VCR-wiki-en-hard-test-500
whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Yuan-2.0
Yuan 2.0 Large Language Model