Tianhe Wu (TianheWu)

Company: Tsinghua University

Location: Beijing

Tianhe Wu's starred repositories

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

License: CC-BY-4.0, Stargazers: 1, Issues: 0
Language: Python, Stargazers: 3, Issues: 0

LeetcodeTop

A collection of high-frequency LeetCode problems commonly asked in interviews at major internet companies 🔥

Stargazers: 18409, Issues: 0

Q-Ground

Official codes for "Q-Ground: Image Quality Grounding with Large Multi-modality Models", ACM MM2024 (Oral)

License: NOASSERTION, Stargazers: 17, Issues: 0
Language: Python, Stargazers: 1400, Issues: 0

MEFNet

Official Implementation of MEF-Net

Language: Python, Stargazers: 83, Issues: 0

3dpe

[ECCV 2024] 3DPE: Real-time 3D-aware Portrait Editing from a Single Image

Stargazers: 16, Issues: 0

open_clip

An open source implementation of CLIP.

Language: Python, License: NOASSERTION, Stargazers: 9387, Issues: 0

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language: Python, License: Apache-2.0, Stargazers: 2312, Issues: 0

HiDiffusion

[ECCV 2024] HiDiffusion: Increases the resolution and speed of your diffusion model by only adding a single line of code!

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 714, Issues: 0

CoSeR

An unofficial implementation for "CoSeR: Bridging Image and Language for Cognitive Super-Resolution (CVPR 2024)"

Language: Python, License: MIT, Stargazers: 20, Issues: 0

Awesome-High-Resolution-Diffusion

🔥🔥🔥A curated list of papers on recent diffusion-based high-resolution image and video synthesis works.

Stargazers: 39, Issues: 0
License: Apache-2.0, Stargazers: 59, Issues: 0

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python, License: Apache-2.0, Stargazers: 1614, Issues: 0

StyleCrafter-SDXL

Code of StyleCrafter on SDXL

Language: Python, License: Apache-2.0, Stargazers: 11, Issues: 0

LM4LV

🔥Official PyTorch implementation for "LM4LV: A Frozen Large Language Model for Low-level Vision Tasks".

Language: Python, License: Apache-2.0, Stargazers: 30, Issues: 0

Diff-Plugin

[CVPR 2024] Official code release of our paper "Diff-Plugin: Revitalizing Details for Diffusion-based Low-level tasks"

Language: Python, Stargazers: 104, Issues: 0

stablediffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Python, License: MIT, Stargazers: 37866, Issues: 0

CaD-VI

Comparison Visual Instruction Tuning (CaD-VI)

Language: Python, Stargazers: 4, Issues: 0

ChartMimic

ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation

Language: Python, Stargazers: 67, Issues: 0

Open-MAGVIT2

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Language: Python, License: Apache-2.0, Stargazers: 348, Issues: 0
Language: Python, Stargazers: 103, Issues: 0

LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Language: Python, License: MIT, Stargazers: 1095, Issues: 0

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python, License: NOASSERTION, Stargazers: 8131, Issues: 0

HunyuanDiT

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Language: Python, License: NOASSERTION, Stargazers: 2960, Issues: 0

MMDialog

The official site of the paper "MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation"

Language: Python, Stargazers: 181, Issues: 0

CCSR

Official codes of CCSR: Improving the Stability of Diffusion Models for Content Consistent Super-Resolution

Language: Python, Stargazers: 400, Issues: 0

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model approaching GPT-4o performance.

Language: Python, License: MIT, Stargazers: 4512, Issues: 0

HDC

The official implementation of Hierarchical Semantic Decoding with Counting Assistance for Generalized Referring Expression Segmentation

License: MIT, Stargazers: 14, Issues: 0