TheShadow29

StableSwarmUI, A Modular Stable Diffusion Web-User-Interface, with an emphasis on making powertools easily accessible, high performance, and extensibility.

Language: C# | License: MIT | 2473 47 257

zero123

Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)

Language: Python | License: MIT | 2470 42 116

Awesome-Video-Diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

2397 114 15

recognize-anything

Open-source and strong foundation image recognition models.

Language: Jupyter Notebook | License: Apache-2.0 | 2396 25 132

make-a-video-pytorch

Implementation of Make-A-Video, new SOTA text to video generator from Meta AI, in Pytorch

Language: Python | License: MIT | 1833 72 15

DPR

Dense Passage Retriever is a set of tools and models for open-domain Q&A tasks.

Language: Python | License: NOASSERTION | 1596 23 210

CLIP_prefix_caption

Simple image captioning model

Language: Jupyter Notebook | License: MIT | 1197 7 76

LLMStack

No-code platform to build LLM Agents, workflows and applications with your data

Language: Python | License: NOASSERTION | 1082 16 37

LLM-in-Vision

Recent LLM-based CV and related works. Welcome to comment/contribute!

687 48 6

ov-seg

This is the official PyTorch implementation of the paper Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP.

Language: Jupyter Notebook | License: NOASSERTION | 622 11 28

plotai

PlotAI - Your Ultimate Plotting Assistant! 📊🤖 Use ChatGPT-3.5 to create plots in Python and Matplotlib directly in your Python script or notebook.

Language: Python | License: Apache-2.0 | 284 4 3

diffusion_reading_group

Diffusion Reading Group at EleutherAI

Language: Jupyter Notebook | 284 23 1

Pseudo-Q

[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding

Language: Python | License: Apache-2.0 | 137 3 18

X2-VLM

All-In-One VLM: Image + Video + Transfer to Other Languages / Domains (TPAMI 2023)

Language: Python | License: BSD-3-Clause | 112 6 16

Playground

Text WebUI extension to add clever Notebooks to Chat mode

Language: Python | 104 7 11

attention-refocusing

Language: Python | License: MIT | 84 1 8

VL-PET

[ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"

Language: Python | License: MIT | 47 2 4

VisIT-Bench

Language: Python | 44 8 3

economist_poll

Which Famous Economist Are You Most Similar To? Data from the IGM expert panel poll and code for extracting it.

Language: Jupyter Notebook | License: NOASSERTION | 30 60

eventful-transformer

Code for our paper "Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers"

Language: Python | License: MIT | 29 4 3

ORES

ORES: Open-vocabulary Responsible Visual Synthesis

Language: Python | License: MIT | 1100