Aniki's repositories
CRIS.pytorch
An official PyTorch implementation of the CRIS paper
Awesome-Cross-Modal-Video-Moment-Retrieval
Continuously updated list of cutting-edge papers on video moment retrieval, temporal language grounding, and video clip retrieval.
awesome-language-model-with-vision
Resources related to vision-and-language models
Awesome-Segment-Anything
A collection of resources about Segment Anything (SAM), including the latest papers and demos
awesome-source-free-test-time-adaptation
[2022] A curated list of papers in Test-time Adaptation, Test-time Training and Source-free Domain Adaptation
Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
Awesome-Video-Diffusion-Models
[Arxiv] A Survey on Video Diffusion Models
datacomp
DataComp: In search of the next generation of multimodal datasets
langchain
⚡ Building applications with LLMs through composability ⚡
LayoutGPT
Official repo for LayoutGPT
MaskCLIP
Official PyTorch implementation of "Extract Free Dense Labels from CLIP" (ECCV 22 Oral)
PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
VideoX
VideoX: a collection of video cross-modal models
Gen-L-Video
The official implementation for "Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising".
LLM-in-Vision
Recent LLM-based computer vision and related works. Comments and contributions are welcome!
MedSegDiff
Official implementation of paper "MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model"
RPG-DiffusionMaster
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
SciencePlots
Matplotlib styles for scientific plotting
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
viper
Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"
visual-chatgpt
Official repo for the paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models