tranhoangkim's starred repositories
VideoLLaMA2
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
unifying-global-local-feature
A deep learning model that spots actions in soccer videos by unifying global and local features