ltt-gddxz's starred repositories
Video-LLaVA
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
flamingo-pytorch
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch
Awesome-LLMs-for-Video-Understanding
🔥🔥🔥 Latest Papers, Codes and Datasets on Vid-LLMs.
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
simple-HRNet
Multi-person Human Pose Estimation with HRNet in Pytorch
DeepFashion2
DeepFashion2 Dataset https://arxiv.org/pdf/1901.07973.pdf
yolov8-pytorch
A yolov8-pytorch repository that can be used to train on your own datasets.
alpaca-lora
Instruct-tune LLaMA on consumer hardware
Chinese-Vicuna
Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model. A low-resource Chinese llama+lora approach, with a structure modeled on alpaca
baidu-image-downloader
Batch downloader for Baidu Images
pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
dense_flow
Tools to extract dense optical flow from videos, based on OpenCV
tsn-pytorch
Temporal Segment Networks (TSN) in PyTorch
tomatoclock
Pomodoro Technique timer
Simple-Baidu-Image-Download
A Baidu Images crawler in only 30 lines, using only the simplest statements
LibFewShot
LibFewShot: A Comprehensive Library for Few-shot Learning. TPAMI 2023.
Data_Label_Tools
A curated collection of open-source data annotation tools
youtube-8m
Code of PhoenixLin (3rd place) in the 2nd Youtube8M Video Understanding Challenge
ViT-pytorch
Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)