baoyb's repositories
2024-AAAI-HPT
Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)
Awesome-Scene-Text-Image-Super-Resolution
A collection of papers and resources on scene text image super-resolution.
BiFormer
[CVPR 2023] Official code release of our paper "BiFormer: Vision Transformer with Bi-Level Routing Attention"
BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
CloFormer
The official code of "Rethinking Local Perception in Lightweight Vision Transformer"
Contrastive-Learning-NLP-Papers
Paper List for Contrastive Learning for Natural Language Processing
darknet
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
DDP-practice
A demo of image classification with PyTorch DDP (DistributedDataParallel) and amp (Automatic Mixed Precision) modules. TODO: Add English version
FashionTex
The official implementation of SIGGRAPH 2023 conference paper, FashionTex: Controllable Virtual Try-on with Text and Texture.
Fast-BEV
Fast-BEV: A Fast and Strong Bird’s-Eye View Perception Baseline
GLCNet
Official implementation of "Global-Local Context Network for Person Search" in PyTorch.
Graphormer
Do Transformers Really Perform Bad for Graph Representation? [NIPS-2021]
LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
MambaIR
A simple baseline for image restoration with state-space model.
MIGC
[CVPR 2024 Highlight] "MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis" (Official Implementation)
MMIF-CDDFuse
[CVPR 2023] Official implementation for "CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion."
mobile-vision
Mobile vision models and code
MSINet
[CVPR2023] Twins Contrastive Search of Multi-Scale Interaction for Object Re-Identification
OpenGait
A flexible and extensible framework for gait recognition. With the help of OpenGait, you can focus on designing your own models and easily compare them with state-of-the-art methods.
personal-paper-code-daily
🎓 Automatically update papers from selected fields daily using GitHub Actions (updates every 12 hours)
Point-cloud-quality-assessment
Collections of papers, databases, and codes targeted at point cloud quality assessment (PCQA), mesh quality assessment (MQA), 3D model quality assessment (3DQA).
Qwen-7B
The official repo of Qwen-7B (通义千问-7B) chat & pretrained large language model proposed by Alibaba Cloud.
qwen-sft
Qwen (通义千问) SFT experiments
RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
SDT
This repository is the official implementation of Disentangling Writer and Character Styles for Handwriting Generation (CVPR23).
sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
SOLIDER
A Semantic Controllable Self-Supervised Learning Framework to learn general human representations from massive unlabeled human images, which can benefit downstream human-centric tasks to the maximum extent
VTG-GPT
VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT
Zero-shot-RIS
[CVPR 2023] Official code for "Zero-shot Referring Image Segmentation with Global-Local Context Features"