ltt-gddxz

ltt-gddxz's starred repositories

Video-LLaVA

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Language: Python | Apache-2.0 | 262400

flamingo-pytorch

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

Language: Python | MIT | 115900

Awesome-LLMs-for-Video-Understanding

🔥🔥🔥 Latest Papers, Code, and Datasets on Vid-LLMs.

VindLU

Language: Python | MIT | 9400

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Language: Jupyter Notebook | BSD-3-Clause | 441100

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | Apache-2.0 | 1077700

YOLOv8-pt

YOLOv8 implementation using PyTorch

Language: Python | AGPL-3.0 | 7500

simple-HRNet

Multi-person Human Pose Estimation with HRNet in Pytorch

Language: Python | GPL-3.0 | 56100

DeepFashion2

DeepFashion2 Dataset https://arxiv.org/pdf/1901.07973.pdf

Language: Jupyter Notebook | 219300

kinetics-dataset

Language: Shell | 70700

regnet

Pytorch implementation of network design paradigm described in the paper "Designing Network Design Spaces"

Language: Python | MIT | 18200

yolov8-pytorch

This is a yolov8-pytorch repository that can be used to train on your own dataset.

Language: Python | GPL-3.0 | 51700

alpaca-lora

Instruct-tune LLaMA on consumer hardware

Language: Jupyter Notebook | Apache-2.0 | 1834100

Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model (a low-resource Chinese llama+lora approach, with a structure modeled on alpaca)

Language: C | Apache-2.0 | 414000

llama

Inference code for Llama models

Language: Python | NOASSERTION | 5395300

ALBEF

Code for ALBEF: a new vision-language pre-training method

Language: Python | BSD-3-Clause | 143400

BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

Language: HTML | Apache-2.0 | 766900

T2T-ViT

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Language: Jupyter Notebook | NOASSERTION | 112500

baidu-image-downloader

Baidu image batch downloader

Language: Python | 2300

pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language: Python | Apache-2.0 | 3043500

vision_transformer

Language: Jupyter Notebook | Apache-2.0 | 962800

dense_flow

Tools to extract dense optical flow from videos, based on OpenCV

Language: C++ | MIT | 24600

tsn-pytorch

Temporal Segment Networks (TSN) in PyTorch

Language: Python | BSD-2-Clause | 105700

tomatoclock

Pomodoro Technique timer

Language: Python | GPL-3.0 | 3100

Simple-Baidu-Image-Download

A Baidu image crawler in only 30 lines, using only the simplest statements

Language: Python | MIT | 1000

LibFewShot

LibFewShot: A Comprehensive Library for Few-shot Learning. TPAMI 2023.

Language: Python | MIT | 86000

pycls

Codebase for Image Classification Research, written in PyTorch.

Language: Python | MIT | 211800

Data_Label_Tools

A curated collection of open-source data annotation tools

youtube-8m

Code of PhoenixLin(3rd place) in the 2nd Youtube8M Video Understanding Challenge

Language: Python | Apache-2.0 | 20500

ViT-pytorch

Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

Language: Jupyter Notebook | MIT | 183300