Zechen Bai (JosephPai)

Company: Amazon

Location: Shanghai

Home Page: www.baizechen.site

Zechen Bai's starred repositories

ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Language: Python | License: GPL-3.0 | Stargazers: 48452 | Issues: 365 | Issues: 2930

llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 36473 | Issues: 383 | Issues: 67

Tvlist-awesome-m3u-m3u8

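A collection of live-streaming source resources 📺 💯 IPTV, M3U. Wash your hands often, wear a mask, and may everyone stay safe from every virus.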

Gooey

Turn (almost) any Python command line program into a full GUI application with one line

Language: Python | License: MIT | Stargazers: 20542 | Issues: 281 | Issues: 599

MediaCrawler

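Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, and Baidu Tieba posts and comment replies.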

Language: Python | License: NOASSERTION | Stargazers: 16125 | Issues: 99 | Issues: 268

AnimateAnyone

Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation

Language: Python | License: Apache-2.0 | Stargazers: 7064 | Issues: 66 | Issues: 70

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 5915 | Issues: 47 | Issues: 78

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | License: Apache-2.0 | Stargazers: 5231 | Issues: 39 | Issues: 37

OLMo

Modeling, training, eval, and inference code for OLMo

Language: Python | License: Apache-2.0 | Stargazers: 4309 | Issues: 45 | Issues: 188

LMOps

General technology for enabling AI capabilities w/ LLMs and MLLMs

Language: Python | License: MIT | Stargazers: 3519 | Issues: 54 | Issues: 106

jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Language: Python | License: NOASSERTION | Stargazers: 2594 | Issues: 37 | Issues: 52

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | License: MIT | Stargazers: 1970 | Issues: 19 | Issues: 46

DeepSeek-LLM

DeepSeek LLM: Let there be answers

Language: Makefile | License: MIT | Stargazers: 1373 | Issues: 24 | Issues: 32

GLEE

[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Language: Python | License: MIT | Stargazers: 1012 | Issues: 47 | Issues: 38

Bunny

A family of lightweight multimodal models.

Language: Python | License: Apache-2.0 | Stargazers: 866 | Issues: 19 | Issues: 107

Osprey

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

Language: Python | License: Apache-2.0 | Stargazers: 740 | Issues: 14 | Issues: 39

ml-aim

This repository provides the code and model checkpoints for the research paper "Scalable Pre-training of Large Autoregressive Image Models".

Language: Python | License: NOASSERTION | Stargazers: 677 | Issues: 18 | Issues: 5

Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Awesome-MLLM-Hallucination

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Language: Python | License: BSD-3-Clause | Stargazers: 258 | Issues: 5 | Issues: 43

syncedlyrics

Get LRC-format (synchronized) lyrics for your music

Language: Python | License: MIT | Stargazers: 218 | Issues: 6 | Issues: 39

imp

A family of highly capable yet efficient large multimodal models

Language: Python | License: Apache-2.0 | Stargazers: 154 | Issues: 5 | Issues: 7

LVVIS

Large-Vocabulary Video Instance Segmentation dataset

Language: Python | License: GPL-3.0 | Stargazers: 73 | Issues: 4 | Issues: 19

NeurIPS2023_SOC

[NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation

BYOC

[IEEE-VR 2024] Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters

Language: Python | License: Apache-2.0 | Stargazers: 5 | Issues: 2 | Issues: 0