whuhxb

Xiaobing Han's repositories

AI2BMD

AI-powered ab initio biomolecular dynamics simulation

MIT

AirSLAM

🚀 AirVO upgrades to AirSLAM 🚀

GPL-3.0

BiRefNet

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation

MIT

clickattention

Apache-2.0

ComCLIP

Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"

MIT

composio

Composio equips agents with well-crafted tools empowering them to tackle complex tasks

NOASSERTION

ControlNeXt

Controllable video and image generation: SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA

Apache-2.0

crab

CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/


cv-model-compression


DeepInteraction

[NeurIPS 2022] DeepInteraction: 3D Object Detection via Modality Interaction

MIT

generative-ai-1

Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI

Apache-2.0

generative-ai-python

The official Python library for the Google Gemini API

Apache-2.0

graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

MIT

HAIR

The Official Implementation for "HAIR: Hypernetworks-based All-in-One Image Restoration".


lazygrounding

[ECCV'24] Official PyTorch implementation of In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation


mediapipe

Cross-platform, customizable ML solutions for live and streaming media.

Apache-2.0

MetaSeg

MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation (Accepted by WACV 2024)

MIT

MuCR

MuCR is a benchmark designed to evaluate Vision Large Language Models' (VLLMs) ability to infer causal relationships using only visual cues


PolyGNN-1


PromptClip

Instantly create video clips from LLM prompts

MIT

ProxyCLIP

[ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation


RISurConv

Official code for the ECCV 2024 paper: RISurConv: Rotation Invariant Surface Attention-Augmented Convolutions for 3D Point Cloud Classification and Segmentation

MIT

SC4D

[ECCV 2024] Official code for: SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer


sg3d

MIT

skydiffusion

Apache-2.0

SLCA

Code for the ICCV 2023 paper: SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model

MIT

SpatialBot

The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models".

MIT

unic

PyTorch code and pretrained weights for the UNIC models.

NOASSERTION

vfusion3d-1

[ECCV 2024] Code for VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

NOASSERTION

VITA

✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
