LI Minghan (MinghanLi)



Company: Hong Kong Polytechnic University

Location: Hong Kong

Home Page: https://sites.google.com/view/minghanli-homepage/academic


LI Minghan's starred repositories

make-a-video-pytorch

Implementation of Make-A-Video, new SOTA text-to-video generator from Meta AI, in PyTorch

Language: Python | License: MIT | Stargazers: 1889 | Issues: 0

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 5909 | Issues: 0

Language: Python | License: MIT | Stargazers: 46 | Issues: 0

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 9964 | Issues: 0

yolact

A simple, fully convolutional model for real-time instance segmentation.

Language: Python | License: MIT | Stargazers: 4997 | Issues: 0

LaGOT

We enrich the LaSOT validation set with annotations of additional object tracks, up to 10 object tracks per video in total. Tracks consist of precise bounding box annotations of moving objects. Annotations are provided at 10 fps. The original LaSOT validation set annotations and video can be downloaded from: https://vision.cs.stonybrook.edu/~lasot/

Language: Python | License: CC-BY-4.0 | Stargazers: 6 | Issues: 0

pytracking

Visual tracking library based on PyTorch.

Language: Python | License: GPL-3.0 | Stargazers: 3178 | Issues: 0

FDL

[CVPR-2024] PyTorch implementation of "Misalignment-Robust Frequency Distribution Loss for Image Transformation"

Language: Python | Stargazers: 28 | Issues: 0

CCSR

Official codes of CCSR: Improving the Stability of Diffusion Models for Content Consistent Super-Resolution

Language: Python | Stargazers: 417 | Issues: 0

SeeSR

[CVPR2024] SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution

Language: Python | License: Apache-2.0 | Stargazers: 373 | Issues: 0

Language: Python | Stargazers: 91 | Issues: 0

gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Language: Python | License: Apache-2.0 | Stargazers: 31762 | Issues: 0

VIPOSeg-Benchmark

The benchmark for "Video Object Segmentation in Panoptic Wild Scenes".

Language: Python | Stargazers: 10 | Issues: 0

Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Language: Python | License: MIT | Stargazers: 2947 | Issues: 0

LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 481 | Issues: 0

POPE

The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"

Language: Python | License: MIT | Stargazers: 160 | Issues: 0

LLMTest_NeedleInAHaystack

Doing simple retrieval from LLM models at various context lengths to measure accuracy

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 1411 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 7063 | Issues: 0

dataset

The Open Images dataset

Language: Python | License: Apache-2.0 | Stargazers: 4242 | Issues: 0

magvit

Official JAX implementation of MAGVIT: Masked Generative Video Transformer

Language: Python | License: Apache-2.0 | Stargazers: 927 | Issues: 0

VGen

Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models

Language: Python | Stargazers: 2855 | Issues: 0

ControlVideo

[ICLR 2024] Official PyTorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"

Language: Python | License: MIT | Stargazers: 752 | Issues: 0

CogVideo

Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Language: Python | License: Apache-2.0 | Stargazers: 5875 | Issues: 0

webvid

Large-scale text-video dataset. 10 million captioned short videos.

Language: Python | Stargazers: 564 | Issues: 0

LanguageBind

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Language: Python | License: MIT | Stargazers: 664 | Issues: 0

Video-LLaVA

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

Language: Python | Stargazers: 233 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 552 | Issues: 0

gigagan-pytorch

Implementation of GigaGAN, new SOTA GAN out of Adobe. Culmination of nearly a decade of research into GANs

Language: Python | License: MIT | Stargazers: 1785 | Issues: 0

Vary

[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

Language: Python | Stargazers: 1691 | Issues: 0

dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 8656 | Issues: 0