baoyb (ahwhbc)



Location: Wuhu, China


baoyb's repositories

2024-AAAI-HPT

Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)

Language: Python, License: MIT, Stargazers: 0, Issues: 0

Awesome-Scene-Text-Image-Super-Resolution

A collection of papers and resources on scene text image super-resolution.

Stargazers: 0, Issues: 0

BiFormer

[CVPR 2023] Official code release of our paper "BiFormer: Vision Transformer with Bi-Level Routing Attention"

License: MIT, Stargazers: 0, Issues: 0

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Language: Jupyter Notebook, License: BSD-3-Clause, Stargazers: 0, Issues: 0

CloFormer

The official code of "Rethinking Local Perception in Lightweight Vision Transformer"

License: MIT, Stargazers: 0, Issues: 0

Contrastive-Learning-NLP-Papers

Paper List for Contrastive Learning for Natural Language Processing

Stargazers: 0, Issues: 0

darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)

License: NOASSERTION, Stargazers: 0, Issues: 0

DDP-practice

A demo of image classification with PyTorch DDP (DistributedDataParallel) and amp (Automatic Mixed Precision) modules. TODO: Add English version

Stargazers: 0, Issues: 0

FashionTex

The official implementation of SIGGRAPH 2023 conference paper, FashionTex: Controllable Virtual Try-on with Text and Texture.

License: MIT, Stargazers: 0, Issues: 0

Fast-BEV

Fast-BEV: A Fast and Strong Bird’s-Eye View Perception Baseline

License: NOASSERTION, Stargazers: 0, Issues: 0

GLCNet

Official implementation of "Global-Local Context Network for Person Search" in PyTorch.

License: MIT, Stargazers: 0, Issues: 0

Graphormer

Do Transformers Really Perform Bad for Graph Representation? [NIPS-2021]

Stargazers: 0, Issues: 0

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

License: BSD-3-Clause, Stargazers: 0, Issues: 0

MambaIR

A simple baseline for image restoration with state-space model.

License: Apache-2.0, Stargazers: 0, Issues: 0

MIGC

[CVPR 2024 Highlight] "MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis" (Official Implementation)

License: NOASSERTION, Stargazers: 0, Issues: 0

MMIF-CDDFuse

[CVPR 2023] Official implementation for "CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion."

Stargazers: 0, Issues: 0

mobile-vision

Mobile vision models and code

License: NOASSERTION, Stargazers: 0, Issues: 0

MSINet

[CVPR2023] Twins Contrastive Search of Multi-Scale Interaction for Object Re-Identification

Stargazers: 0, Issues: 0

OpenGait

A flexible and extensible framework for gait recognition. You can focus on designing your own models and comparing with state-of-the-arts easily with the help of OpenGait.

Stargazers: 0, Issues: 0

personal-paper-code-daily

🎓 Automatically update papers in selected fields daily using GitHub Actions (updates every 12 hours)

License: MIT, Stargazers: 0, Issues: 0

Point-cloud-quality-assessment

Collections of papers, databases, and codes targeted at point cloud quality assessment (PCQA), mesh quality assessment (MQA), 3D model quality assessment (3DQA).

Stargazers: 0, Issues: 0

Qwen-7B

The official repo of Qwen-7B (通义千问-7B) chat & pretrained large language model proposed by Alibaba Cloud.

License: NOASSERTION, Stargazers: 0, Issues: 0

qwen-sft

Tongyi Qianwen (Qwen) SFT experiments

Stargazers: 0, Issues: 0

RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Stargazers: 0, Issues: 0

SDT

This repository is the official implementation of Disentangling Writer and Character Styles for Handwriting Generation (CVPR23).

Language: Python, License: MIT, Stargazers: 0, Issues: 0

sentence-transformers

Multilingual Sentence & Image Embeddings with BERT

License: Apache-2.0, Stargazers: 0, Issues: 0

SOLIDER

A Semantic Controllable Self-Supervised Learning Framework for learning general human representations from massive unlabeled human images, which can benefit downstream human-centric tasks to the maximum extent.

License: Apache-2.0, Stargazers: 0, Issues: 0

VTG-GPT

VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT

License: MIT, Stargazers: 0, Issues: 0

Zero-shot-RIS

[CVPR 2023] Official code for "Zero-shot Referring Image Segmentation with Global-Local Context Features"

Stargazers: 0, Issues: 0