Shuai Liu (choiszt)

Company: Nanyang Technological University

Location: Beijing

Shuai Liu's starred repositories

stitching

A Python package for fast and robust Image Stitching

Language: Python | License: Apache-2.0 | Stargazers: 1941 | Issues: 0

mv-extractor

Extract frames and motion vectors from H.264 and MPEG-4 encoded video.

Language: C | License: MIT | Stargazers: 270 | Issues: 0

LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 462 | Issues: 0

video2game

Code release of Video2Game

Language: JavaScript | License: MIT | Stargazers: 290 | Issues: 0

lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Language: Python | License: NOASSERTION | Stargazers: 1164 | Issues: 0

TATS

Official PyTorch implementation of TATS: A Long Video Generation Framework with Time-Agnostic VQGAN and Time-Sensitive Transformer (ECCV 2022)

Language: Python | License: MIT | Stargazers: 259 | Issues: 0

EgoThink

[CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 40 | Issues: 0

EgoVideo

[CVPR 2024 Champions] Solutions for EgoVis Challenges in CVPR 2024

Language: Jupyter Notebook | Stargazers: 96 | Issues: 0

chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Language: Python | License: NOASSERTION | Stargazers: 1604 | Issues: 0

titok-pytorch

Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"

Language: Python | License: MIT | Stargazers: 154 | Issues: 0

EgoVLP

[NeurIPS2022] Egocentric Video-Language Pretraining

Language: Python | Stargazers: 220 | Issues: 0

PySceneDetect

Python and OpenCV-based scene cut/transition detection program & library.

Language: Python | License: BSD-3-Clause | Stargazers: 3030 | Issues: 0

VideoTree

Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"

Language: Python | License: MIT | Stargazers: 53 | Issues: 0

Video-MME

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Stargazers: 325 | Issues: 0

Interactive-Predicate-Learning

InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning (RSS 2024)

Language: Python | License: MIT | Stargazers: 24 | Issues: 0

Language: Python | Stargazers: 789 | Issues: 0

LOVA3

The official repo of "Learning to Visual Question Answering, Asking and Assessment"

Language: Python | Stargazers: 9 | Issues: 0

Language: Python | Stargazers: 402 | Issues: 0

VQLoC

(NeurIPS 2023) Open-set visual object query search & localization in long-form videos

Language: Python | Stargazers: 18 | Issues: 0

self-infilling

[ICML 2024] Self-Infilling Code Generation

Language: Python | Stargazers: 16 | Issues: 0

CLIPS.jl

Cooperative Language-Guided Inverse Plan Search (CLIPS).

Language: Julia | Stargazers: 11 | Issues: 0

guidance

A guidance language for controlling large language models.

Language: Jupyter Notebook | License: MIT | Stargazers: 18332 | Issues: 0

Awesome-LLM4AD

A curated list of awesome LLM for Autonomous Driving resources (continually updated)

License: Apache-2.0 | Stargazers: 803 | Issues: 0

AudioMAE

This repo hosts the code and models of "Masked Autoencoders that Listen".

Language: Python | License: NOASSERTION | Stargazers: 510 | Issues: 0

InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language: Python | License: Apache-2.0 | Stargazers: 1161 | Issues: 0

EgocentricVision

🔍 Explore Egocentric Vision: research, data, challenges, real-world apps. Stay updated & contribute to our dynamic repository! Work-in-progress; join us!

Stargazers: 59 | Issues: 0

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 8114 | Issues: 0

sglang

SGLang is yet another fast serving framework for large language models and vision language models.

Language: Python | License: Apache-2.0 | Stargazers: 3486 | Issues: 0

SwiftSage

SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks

Language: Python | Stargazers: 231 | Issues: 0

FreeVA

FreeVA: Offline MLLM as Training-Free Video Assistant

Language: Python | License: Apache-2.0 | Stargazers: 38 | Issues: 0