orzlh

刘恒's starred repositories

exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Language: Python | GPL-3.0 | 564500

composed-video-retrieval

Composed Video Retrieval

Language: Python | Apache-2.0 | 4000

ControlNet-v1-1-nightly

Nightly release of ControlNet 1.1

Language: Python | 459600

SiTH

[CVPR 2024] SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion

Language: Python | MIT | 9500

T-MASS-text-video-retrieval

Official implementation of "Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval (CVPR 2024 Highlight)"

Language: Python | 3700

xpool

https://layer6ai-labs.github.io/xpool/

Language: Python | 10900

CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

Language: Python | MIT | 82900

Youku-mPLUG

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks

Language: Python | Apache-2.0 | 27300

Chinese-CLIP

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Language: Python | MIT | 418200

This repo contains the documentation and code needed to use the PACO dataset: data loaders, training and evaluation scripts for object, part, and attribute prediction models, query evaluation scripts, and visualization notebooks.

Language: Python | MIT | 26300

ghiaseddin

Author's implementation of the paper "Deep Relative Attributes" (ACCV 2016)

Language: Jupyter Notebook | MIT | 4200

LaBo

CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification

Language: Python | 6600

LM4CV

The official implementation of the paper **Learning Concise and Descriptive Attributes for Visual Recognition**

Language: Python | 3800

classify_by_description_release

Language: Python | 15200

DUET

[Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning

Language: Python | MIT | 4400

I2DFormer

Code for CVPR23 Highlight "I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification" and NeurIPS2022 "I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification"

Language: Python | GPL-3.0 | 1800

flickr_scraper

Simple Flickr Image Scraper

Language: Python | AGPL-3.0 | 20800