Beast code in Giters

Feng Zheng's repositories

Track-Anything

Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.

Language: Python, License: MIT

Ads-1k

Dataset of 1000+ video advertisements, proposed in "Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward" (ACCV 2022)

Language: Python, License: BSD-2-Clause

AnomalyDetection-SoftPatch

Code for the NeurIPS 2022 paper "SoftPatch: Unsupervised Anomaly Detection with Noisy Data"

Language: Python

Audio-Visual-Figure-Skating

Language: Python

awesome-industrial-anomaly-detection

Paper list and datasets for industrial image anomaly detection (defect detection).

awesome-multimodal-brain-image-systhesis

Awesome_Prompting_Papers_in_Computer_Vision

A curated list of prompt-based papers in computer vision and vision-language learning.

Caption-Anything

Caption-Anything is a versatile tool that combines image segmentation, visual captioning, and ChatGPT to generate tailored captions with diverse controls for user preferences.

Language: Python, License: BSD-3-Clause

Continual_Anomaly_Detection

Language: Python

FedMed-GAN

Language: Python

FLM

Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)

LaunchpadGPT

Repo for the ICMC 2023 paper "LaunchpadGPT: Language Model as Music Visualization Designer on Launchpad"

License: MIT

Open-Lyrics

Transcribe (Whisper) and translate (GPT) audio files into LRC lyrics files.

License: MIT

PDVC

End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)

Language: Python, License: MIT

UnAV

Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)

License: MIT

VLMixer

VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix (ICML 2022)

License: BSD-3-Clause