feymanpriv

BUPT

Beijing

yangmin09's starred repositories

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | 128424 1099 15113

paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

License: Apache-2.0 | 24894 7030

ConvNeXt

Code release for ConvNeXt model

Language: Python | License: MIT | 5620 33 130

img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Language: Python | License: MIT | 3425 30 250

vector-quantize-pytorch

Vector (and Scalar) Quantization, in PyTorch

Language: Python | License: MIT | 2130 30 104

GLIP

Grounded Language-Image Pre-training

Language: Python | License: MIT | 2052 45 168

RecommenderSystem

detrex

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.

Language: Python | License: Apache-2.0 | 1895 26 155

fastdup

fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. It helps you improve the quality of your dataset's images and labels and reduce your data operations costs at an unparalleled scale.

Language: Python | License: NOASSERTION | 1508 22 237

Video-Pre-Training

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

Language: Python | License: MIT | 1241 28 31

VideoX

VideoX: a collection of video cross-modal models

Language: Python | License: NOASSERTION | 942 22 110

SimMIM

This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".

Language: Python | License: MIT | 887 22 41

cv-arxiv-daily

🎓 Automatically update CV papers daily using GitHub Actions (updates every 12 hours)

Language: Python | License: Apache-2.0 | 804 37 2

FastestDet

⚡ A newly designed ultra-lightweight, anchor-free object detection algorithm with only 250K parameters. It reduces time consumption by 10% compared with yolo-fastest, and its post-processing is simpler.

Language: Python | License: BSD-3-Clause | 747 12 38

CVPR-2022-Papers

self_supervised

A PyTorch Lightning implementation of self-supervised algorithms

Language: Python | License: MIT | 523 12 13

XPretrain

Multi-modality pre-training

Language: Python | License: NOASSERTION | 451 14 35

QuadTreeAttention

QuadTree Attention for Vision Transformers (ICLR 2022)

Language: Jupyter Notebook | 329 11 29

ovr-cnn

A new framework for open-vocabulary object detection, based on maskrcnn-benchmark

Language: Python | License: MIT | 215 5 28

knowhere

Knowhere is an open-source vector search engine, integrating FAISS, HNSW, etc.

Language: C++ | License: Apache-2.0 | 201 14 199

Text4Vis

【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective

Language: Python | License: MIT | 198 6 23

learning_minimal

Learning to Solve Hard Minimal Problems

Language: C++ | License: NOASSERTION | 141 6 5

MCQ

Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).

Language: Python | 135 4 17

MUST

PyTorch code for MUST

Language: Python | License: BSD-3-Clause | 103 6 10

BootMAE

ECCV 2022, Bootstrapped Masked Autoencoders for Vision BERT Pretraining

Language: Python | 96 3 2

everything_at_once

This is the official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval". CVPR 2022

Language: Python | 91 2 14

CLIP4CirDemo

[CVPR 2022 - Demo Track] - Effective conditioned and composed image retrieval combining CLIP-based features

Language: SCSS | 70 2 4

LAVENDER

A Unified Framework for Video-Language Understanding

Language: Python | License: MIT | 55 16 7

met

A large-scale dataset for instance-level recognition of artworks.

Language: Python | License: MIT | 46 3 1

Universal-Transformer

Training Google Universal Image Embedding

Language: Python | License: MIT | 1 10