Shuyang Sun (kevin-ssy)

Company: University of Oxford

Location: Oxford, United Kingdom

Home Page: https://kevin-ssy.github.io/


Organizations
torrvision

Shuyang Sun's starred repositories

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 46208 | Issues: 305 | Issues: 658

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 14509 | Issues: 115 | Issues: 380

mae

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

Language: Python | License: NOASSERTION | Stargazers: 7077 | Issues: 58 | Issues: 187

InternLM

Official release of InternLM2.5 base and chat models. 1M context support

Language: Python | License: Apache-2.0 | Stargazers: 6065 | Issues: 54 | Issues: 321

ConvNeXt

Code release for ConvNeXt model

Language: Python | License: MIT | Stargazers: 5678 | Issues: 33 | Issues: 128

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | License: MIT | Stargazers: 3924 | Issues: 114 | Issues: 74

ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)

Language: Python | License: Apache-2.0 | Stargazers: 2811 | Issues: 20 | Issues: 276

mmdeploy

OpenMMLab Model Deployment Framework

Language: Python | License: Apache-2.0 | Stargazers: 2661 | Issues: 37 | Issues: 1576

MotionCtrl

Official Code for MotionCtrl [SIGGRAPH 2024]

Language: Python | License: Apache-2.0 | Stargazers: 1232 | Issues: 50 | Issues: 31

OMG-Seg

OMG-LLaVA and OMG-Seg codebase

Language: Python | License: NOASSERTION | Stargazers: 1181 | Issues: 23 | Issues: 28

DragDiffusion

[CVPR2024, Highlight] Official code for DragDiffusion

Language: Python | License: Apache-2.0 | Stargazers: 1114 | Issues: 26 | Issues: 63

FateZero

[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"

Language: Jupyter Notebook | License: MIT | Stargazers: 1084 | Issues: 14 | Issues: 33

Bunny

A family of lightweight multimodal models.

Language: Python | License: Apache-2.0 | Stargazers: 851 | Issues: 19 | Issues: 106

ScaleCrafter

[ICLR 2024 Spotlight] Official implementation of ScaleCrafter for higher-resolution visual generation at inference time.

PointLLM

[ECCV 2024 Oral] PointLLM: Empowering Large Language Models to Understand Point Clouds

TokenCut

(CVPR 2022) PyTorch implementation of "Self-supervised transformers for unsupervised object discovery using normalized cut"

Language: Jupyter Notebook | License: MIT | Stargazers: 295 | Issues: 7 | Issues: 15

fc-clip

[NeurIPS 2023] This repo contains the code for our paper Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP

Language: Python | License: Apache-2.0 | Stargazers: 269 | Issues: 16 | Issues: 33

DDQ

Dense Distinct Query for End-to-End Object Detection (CVPR2023)

Language: Python | License: Apache-2.0 | Stargazers: 243 | Issues: 9 | Issues: 21

SyntheticData

Is synthetic data from generative models ready for image recognition?

Language: Python | License: Apache-2.0 | Stargazers: 171 | Issues: 13 | Issues: 9

TransMix

[CVPR 2022] This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transformers.

Language: Python | License: Apache-2.0 | Stargazers: 158 | Issues: 11 | Issues: 19

UniHSI

[ICLR 2024 Spotlight] Unified Human-Scene Interaction via Prompted Chain-of-Contacts

PartImageNet

Introduction and scripts for the paper "PartImageNet: A Large, High-Quality Dataset of Parts" (Ju He, Shuo Yang, Shaokang Yang, Adam Kortylewski, Xiaoding Yuan, Jie-Neng Chen, Shuai Liu, Cheng Yang, Alan Yuille).

AVION

Code release for "Training a Large Video Model on a Single Machine in a Day"

Language: Python | License: MIT | Stargazers: 105 | Issues: 1 | Issues: 10
Language: Python | License: MIT | Stargazers: 95 | Issues: 2 | Issues: 6

Training-Data-Synthesis

[ICLR 2024] Real-Fake: Effective Training Data Synthesis Through Distribution Matching

Language: Python | License: MIT | Stargazers: 69 | Issues: 3 | Issues: 4

kmax-deeplab

A PyTorch re-implementation, based on Detectron2, of the ECCV 2022 paper: k-means mask Transformer.

Language: Python | License: Apache-2.0 | Stargazers: 65 | Issues: 7 | Issues: 3

KDEP

(CVPR2022) Official PyTorch Implementation of KDEP. Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability

Language: Python | License: Apache-2.0 | Stargazers: 62 | Issues: 2 | Issues: 2

RAG-Driver

A Multi-Modal Large Language Model with Retrieval-augmented In-context Learning capacity designed for generalisable and explainable end-to-end driving

Language: Python | License: Apache-2.0 | Stargazers: 58 | Issues: 8 | Issues: 5

IMProv

IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks

Oxford_HIC

A large-scale humour-oriented image text dataset

Language: Python | License: MIT | Stargazers: 8 | Issues: 2 | Issues: 0