wanggrun

Guangrun Wang (王广润)'s repositories

Grounded-Segment-Anything

Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook | License: Apache-2.0 | 200

4D-Humans

4DHumans: Reconstructing and Tracking Humans with Transformers

Language: Python | License: MIT | 100

Depth-Anything

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

License: Apache-2.0 | 100

AnyDoor

Official implementation of the paper "AnyDoor: Zero-shot Object-level Image Customization"

License: MIT | 000

autogen

A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap

License: CC-BY-4.0 | 000

CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

Language: Jupyter Notebook | License: MIT | 000

dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.

Language: Jupyter Notebook | License: Apache-2.0 | 000

DIS

This is the repo for our new project, Highly Accurate Dichotomous Image Segmentation

License: Apache-2.0 | 000

FLatten-Transformer

Official repository of FLatten Transformer (ICCV2023)

Language: Python | 000

GPT-4V-Act

AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI

Language: JavaScript | 000

humannerf

HumanNeRF turns a monocular video of moving people into a 360 free-viewpoint video.

Language: Python | License: MIT | 000

IDM-VTON

IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild

000

inpaint-anything

Inpaint Anything performs stable diffusion inpainting on a browser UI using masks from Segment Anything.

License: Apache-2.0 | 000

IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.

Language: Jupyter Notebook | License: Apache-2.0 | 000

ladi-vton

This is the official repository for the paper "LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On".

Language: Python | License: NOASSERTION | 000

llama

Inference code for LLaMA models

Language: Python | License: NOASSERTION | 000

LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Language: Python | License: NOASSERTION | 000

OOTDiffusion

Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

License: NOASSERTION | 000

PeRF

[Technical Report 2023] PERF: Panoramic Neural Radiance Field from a Single Panorama

Language: Python | 000

pyllama

LLaMA: Open and Efficient Foundation Language Models

Language: Python | License: GPL-3.0 | 000

pytorch-image-models-v2

PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language: Python | License: Apache-2.0 | 000

rcg

PyTorch implementation of RCG (https://arxiv.org/abs/2312.03701)

Language: Python | License: MIT | 000

StableVITON

License: NOASSERTION | 000

torch-ngp

A PyTorch CUDA extension implementation of instant-ngp (SDF and NeRF), with a GUI.

Language: Python | License: MIT | 000

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction"

License: MIT | 000

ViT-Adapter

[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions

Language: Python | License: Apache-2.0 | 000

Grounded-Segment-Anything

wanggrun.github.io

4D-Humans

Depth-Anything

materials_discovery

AnyDoor

autogen

CLIP

dinov2

DIS

FLatten-Transformer

GPT-4V-Act

humannerf

IDM-VTON

im-server

inpaint-anything

IP-Adapter

ladi-vton

latent-diffusion-inpainting

llama

LLaMA2-Accessory

OOTDiffusion

PeRF

pyllama

pytorch-image-models-v2

rcg

StableVITON

torch-ngp

VAR

ViT-Adapter