ltt-gddxz's starred repositories

Video-LLaVA

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Language:PythonLicense:Apache-2.0Stargazers:2624Issues:0Issues:0

flamingo-pytorch

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

Language:PythonLicense:MITStargazers:1159Issues:0Issues:0

Awesome-LLMs-for-Video-Understanding

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

Stargazers:847Issues:0Issues:0
Language:PythonLicense:MITStargazers:94Issues:0Issues:0

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Language:Jupyter NotebookLicense:BSD-3-ClauseStargazers:4411Issues:0Issues:0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Language:PythonLicense:Apache-2.0Stargazers:10777Issues:0Issues:0

YOLOv8-pt

YOLOv8 implementation using PyTorch

Language:PythonLicense:AGPL-3.0Stargazers:75Issues:0Issues:0

simple-HRNet

Multi-person Human Pose Estimation with HRNet in Pytorch

Language:PythonLicense:GPL-3.0Stargazers:561Issues:0Issues:0

DeepFashion2

DeepFashion2 Dataset https://arxiv.org/pdf/1901.07973.pdf

Language:Jupyter NotebookStargazers:2193Issues:0Issues:0
Language:ShellStargazers:707Issues:0Issues:0

regnet

Pytorch implementation of network design paradigm described in the paper "Designing Network Design Spaces"

Language:PythonLicense:MITStargazers:182Issues:0Issues:0

yolov8-pytorch

This is a yolov8-pytorch repository that can be used to train on your own datasets.

Language:PythonLicense:GPL-3.0Stargazers:517Issues:0Issues:0

alpaca-lora

Instruct-tune LLaMA on consumer hardware

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:18341Issues:0Issues:0

Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model, a low-resource Chinese llama+lora approach with a structure modeled on alpaca

Language:CLicense:Apache-2.0Stargazers:4140Issues:0Issues:0

llama

Inference code for Llama models

Language:PythonLicense:NOASSERTIONStargazers:53953Issues:0Issues:0

ALBEF

Code for ALBEF: a new vision-language pre-training method

Language:PythonLicense:BSD-3-ClauseStargazers:1434Issues:0Issues:0

BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

Language:HTMLLicense:Apache-2.0Stargazers:7669Issues:0Issues:0

T2T-ViT

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Language:Jupyter NotebookLicense:NOASSERTIONStargazers:1125Issues:0Issues:0

baidu-image-downloader

Baidu image batch downloader

Language:PythonStargazers:23Issues:0Issues:0

pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language:PythonLicense:Apache-2.0Stargazers:30435Issues:0Issues:0
Language:Jupyter NotebookLicense:Apache-2.0Stargazers:9628Issues:0Issues:0

dense_flow

Tools to extract dense optical flow from videos, based on OpenCV

Language:C++License:MITStargazers:246Issues:0Issues:0

tsn-pytorch

Temporal Segment Networks (TSN) in PyTorch

Language:PythonLicense:BSD-2-ClauseStargazers:1057Issues:0Issues:0

tomatoclock

Pomodoro Technique timer

Language:PythonLicense:GPL-3.0Stargazers:31Issues:0Issues:0

Simple-Baidu-Image-Download

A Baidu image crawler in only 30 lines, using only the simplest statements

Language:PythonLicense:MITStargazers:10Issues:0Issues:0

LibFewShot

LibFewShot: A Comprehensive Library for Few-shot Learning. TPAMI 2023.

Language:PythonLicense:MITStargazers:860Issues:0Issues:0

pycls

Codebase for Image Classification Research, written in PyTorch.

Language:PythonLicense:MITStargazers:2118Issues:0Issues:0

Data_Label_Tools

A curated collection of open-source data annotation tools

Stargazers:743Issues:0Issues:0

youtube-8m

Code of PhoenixLin (3rd place) in the 2nd Youtube8M Video Understanding Challenge

Language:PythonLicense:Apache-2.0Stargazers:205Issues:0Issues:0

ViT-pytorch

Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

Language:Jupyter NotebookLicense:MITStargazers:1833Issues:0Issues:0