lxtGH

Bytedance

Singapore

https://lxtgh.github.io/

Xiangtai Li's starred repositories

video-ttt-release

Language: Python, License: MIT, 5100

Awesome-Video-Diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

kmax-deeplab

A PyTorch re-implementation, based on Detectron2, of the ECCV 2022 paper "k-means Mask Transformer".

Language: Python, License: Apache-2.0, 6400

Prompt-Diffusion

Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models"

Language: Python, License: Apache-2.0, 35800

GPT4RoI

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

Language: Python, License: NOASSERTION, 47100

StableSR

Exploiting Diffusion Prior for Real-World Image Super-Resolution

Language: Python, License: NOASSERTION, 191800

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python, License: Apache-2.0, 1743100

InternLM

Official release of the InternLM2 7B and 20B base and chat models, with 200K context support.

Language: Python, License: Apache-2.0, 543900

CrossKD

CrossKD: Cross-Head Knowledge Distillation for Dense Object Detection

Language: Python, License: NOASSERTION, 11600

VoxFormer

Official PyTorch implementation of VoxFormer [CVPR 2023 Highlight]

Language: Python, License: NOASSERTION, 98200

StreamPETR

[ICCV 2023] StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

Language: Python, License: NOASSERTION, 50000

UniAD

[CVPR 2023 Best Paper] Planning-oriented Autonomous Driving

Language: Python, License: Apache-2.0, 299900

Point-In-Context

[NeurIPS 2023] Implementation of the paper "Explore In-Context Learning for 3D Point Cloud Understanding"

Language: Python, 6000

OmniObject3D

[CVPR 2023 Award Candidate] OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation

Language: Python, 42500

ULIP

Language: Python, License: BSD-3-Clause, 37200

InternLM-techreport

hiera

Hiera: A fast, powerful, and simple hierarchical vision transformer.

Language: Python, License: Apache-2.0, 70800

Voyager

An Open-Ended Embodied Agent with Large Language Models

Language: JavaScript, License: MIT, 528200

ContextDET

Contextual Object Detection with Multimodal Large Language Models

License: NOASSERTION, 16800

InternVideo

Video Foundation Models & Data for Multimodal Understanding

Language: Python, License: Apache-2.0, 105000

InternGPT

InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM).

Language: Python, License: Apache-2.0, 315500

DragGAN

Official Code for DragGAN (SIGGRAPH 2023)

Language: Python, License: NOASSERTION, 3529200

LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Language: Python, License: GPL-3.0, 558200

learning_research

My research experience

ImageBind

ImageBind: One Embedding Space to Bind Them All

Language: Python, License: NOASSERTION, 799700

open_flamingo

An open-source framework for training large multimodal models.

Language: Python, License: MIT, 352200

SegmentAnyRGBD

Segment Any RGBD

Language: Python, License: NOASSERTION, 73600

Multimodal-GPT

Multimodal-GPT

Language: Python, License: Apache-2.0, 143200

eqlv2

The official implementation of Equalization Loss v1 and v2 (CVPR 2020, 2021), based on MMDetection. https://arxiv.org/abs/2012.08548 https://arxiv.org/abs/2003.05176

Language: Python, License: Apache-2.0, 15300

BasicSR

Open-source image and video restoration toolbox for super-resolution, denoising, deblurring, etc. It currently includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. It also supports StyleGAN2 and DFDNet.

Language: Python, License: Apache-2.0, 635300