CVIP (CV-IP)

computer vision and image processing projects

Location: Chengdu, Sichuan

CVIP's repositories

APGCC

ECCV24 - Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance

License: MIT | Stargazers: 0 | Issues: 0

Awesome-Foundation-Models

A curated list of foundation models for vision and language tasks

Stargazers: 0 | Issues: 0

BasicPBC

Official Implementation of "Learning Inclusion Matching for Animation Paint Bucket Colorization"

License: NOASSERTION | Stargazers: 0 | Issues: 0

E2STR

The official code for the CVPR 2024 paper: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer

License: Apache-2.0 | Stargazers: 0 | Issues: 0

EfficientTrain

1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.

License: MIT | Stargazers: 0 | Issues: 0

hriq

High Resolution Image Quality (HRIQ) database and model

Stargazers: 0 | Issues: 0

MDKNet

Modulating Domain-Specific Knowledge for Multi-domain Crowd Counting

Stargazers: 0 | Issues: 0

mgc

The official implementation of the paper "Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning"

License: NOASSERTION | Stargazers: 0 | Issues: 0

MLoRE

Project Page for "Multi-Task Dense Prediction via Mixture of Low-Rank Experts"

Stargazers: 0 | Issues: 0

MobileAgent

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

License: MIT | Stargazers: 0 | Issues: 0

MPCount

Official repo for CVPR2024 paper "Single Domain Generalization for Crowd Counting"

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Official_Remote_Sensing_Mamba

Official code of Remote Sensing Mamba

Stargazers: 0 | Issues: 0

PIIP

Parameter-Inverted Image Pyramid Networks (PIIP)

License: MIT | Stargazers: 0 | Issues: 0

PromptAlign

[NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization

Stargazers: 0 | Issues: 0

Q-Bench

①[ICLR2024 Spotlight] (GPT-4V/Gemini-Pro/Qwen-VL-Plus+16 OS MLLMs) A benchmark for multi-modality LLMs (MLLMs) on low-level vision and visual quality assessment.

Stargazers: 0 | Issues: 0

Rewrite-the-Stars

[CVPR 2024] Rewrite the Stars

License: Apache-2.0 | Stargazers: 0 | Issues: 0

RWKV-CLIP

The official code of "RWKV-CLIP: A Robust Vision-Language Representation Learner"

License: MIT | Stargazers: 0 | Issues: 0

RWKV-infctx-trainer

RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond!

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Shadow_R

This is the official PyTorch implementation of ShadowRefiner. Our method is the winner of the Perceptual Track and achieves the second-best performance in the Fidelity Track of the NTIRE 2024 Shadow Removal Challenge (CVPR 2024 Workshop).

License: MIT | Stargazers: 0 | Issues: 0

StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

License: MIT | Stargazers: 0 | Issues: 0

TSCM

[ICRA24] TSCM: A Teacher-Student Model for Vision Place Recognition Using Cross-Metric Knowledge Distillation

License: MIT | Stargazers: 0 | Issues: 0

Vision-RWKV

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

License: Apache-2.0 | Stargazers: 0 | Issues: 0

VisualRWKV

VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

ViTamin

[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"

License: Apache-2.0 | Stargazers: 0 | Issues: 0