Wentao Zhu's repositories
AnatomyNet-for-anatomical-segmentation
AnatomyNet: Deep 3D Squeeze-and-excitation U-Nets for fast and fully automated whole-volume anatomical segmentation
adversarial-deep-structural-networks
ISBI2018: Adversarial Deep Structural Networks for Mammographic Mass Segmentation https://arxiv.org/abs/1612.05970
leetcode-master
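LeetCode problem-solving guide: a recommended order for 200 classic problems, 600,000 words of detailed illustrated explanations, video analysis of difficult points, and more than 50 mind maps, with versions in C++, Java, Python, Go, JavaScript, and other languages, so that studying algorithms is no longer confusing. Take a look; you will wish you had found it sooner.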
ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
asv-subtools
An Open-Source Toolkit for Speaker Recognition
ccf_2020_qa_match
First-place (top-1) solution for the CCF 2020 QA match competition
CLIP
Contrastive Language-Image Pretraining
D-TDNN
PyTorch implementation of Densely Connected Time Delay Neural Network
DAT
Repository of Vision Transformer with Deformable Attention (CVPR2022)
Deformable-DETR
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Det3D
World's first general-purpose 3D object detection codebase.
Few-shot-NAS
The official repo for Few-Shot Neural Architecture Search (ICML'21 long oral)
flamingo-pytorch
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch
GEBD
Generic Event Boundary Detection: A Benchmark for Event Segmentation
machine-learning-systems-design
A booklet on machine learning systems design with exercises
manning
Repository for the book Grokking Machine Learning, by Manning Editors
mmt
Multi-Modal Transformer for Video Retrieval
Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
UniFormer
[ICLR2022] official implementation of UniFormer
vision
Datasets, Transforms and Models specific to Computer Vision
ViT-pytorch
Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
vit-pytorch-1
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
voxceleb_trainer
In defence of metric learning for speaker recognition
wav2tok
Codebase for the ICLR 2023 paper "Wav2Tok: Deep Sequence Tokenizer for Audio Retrieval"