Feng Zheng's repositories
Track-Anything
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
Ads-1k
Dataset of 1000+ video advertisements, introduced in "Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward" (ACCV 2022)
AnomalyDetection-SoftPatch
Code for NeurIPS 2022 paper "SoftPatch: Unsupervised Anomaly Detection with Noisy Data"
awesome-industrial-anomaly-detection
Paper list and datasets for industrial image anomaly detection (defect detection).
Awesome_Prompting_Papers_in_Computer_Vision
A curated list of prompt-based papers in computer vision and vision-language learning.
Caption-Anything
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences.
FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
LaunchpadGPT
Repo for ICMC 2023 paper: LaunchpadGPT: Language Model as Music Visualization Designer on Launchpad
Open-Lyrics
Transcribe (Whisper) and translate (GPT) audio files into LRC lyrics files.
PDVC
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
UnAV
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
VLMixer
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix (ICML 2022)