Saining Xie (s9xie)

Company: Courant Institute of Mathematical Sciences, New York University

Location: NYC

Home Page: sainingxie.com

Twitter: @sainingxie

Saining Xie's starred repositories

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 45036 | Issues: 298 | Issues: 648

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | Stargazers: 33874 | Issues: 354 | Issues: 298

ControlNet

Let us control diffusion models!

Language: Python | License: Apache-2.0 | Stargazers: 28658 | Issues: 214 | Issues: 523

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 13927 | Issues: 114 | Issues: 368

Awesome-Multimodal-Large-Language-Models

Latest Advances on Multimodal Large Language Models

ConvNeXt

Code release for ConvNeXt model

Language: Python | License: MIT | Stargazers: 5602 | Issues: 33 | Issues: 130

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 5485 | Issues: 46 | Issues: 73

img2dataset

Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.

Language: Python | License: MIT | Stargazers: 3381 | Issues: 30 | Issues: 249

scenic

Scenic: A Jax Library for Computer Vision Research and Beyond

Language: Python | License: Apache-2.0 | Stargazers: 3089 | Issues: 40 | Issues: 238

zero123

Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)

Language: Python | License: MIT | Stargazers: 2548 | Issues: 43 | Issues: 120

Painter

Painter & SegGPT Series: Vision Foundation Models from BAAI

Language: Python | License: MIT | Stargazers: 2455 | Issues: 36 | Issues: 66

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | License: AGPL-3.0 | Stargazers: 2439 | Issues: 41 | Issues: 0

SuGaR

[CVPR 2024] Official PyTorch implementation of SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Language: C++ | License: NOASSERTION | Stargazers: 1828 | Issues: 59 | Issues: 195

X-Decoder

[CVPR 2023] Official implementation of X-Decoder for generalized decoding for pixel, image and language

Language: Python | License: Apache-2.0 | Stargazers: 1268 | Issues: 34 | Issues: 65

deepscatter

Zoomable, animated scatterplots in the browser that scale to over a billion points

Language: TypeScript | License: NOASSERTION | Stargazers: 989 | Issues: 15 | Issues: 58

MCC

Multiview Compressive Coding for 3D Reconstruction

Language: Python | License: NOASSERTION | Stargazers: 617 | Issues: 14 | Issues: 20

SiT

Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"

Language: Python | License: MIT | Stargazers: 533 | Issues: 10 | Issues: 17
Language: Python | License: Apache-2.0 | Stargazers: 484 | Issues: 12 | Issues: 17

ott

Optimal transport tools implemented with the JAX framework, to get differentiable, parallel and jit-able computations.

Language: Python | License: Apache-2.0 | Stargazers: 467 | Issues: 10 | Issues: 178

vstar

PyTorch Implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"

Language: Python | License: MIT | Stargazers: 452 | Issues: 10 | Issues: 14

flip

Official open-source code for "Scaling Language-Image Pre-training via Masking"

Language: Python | License: NOASSERTION | Stargazers: 383 | Issues: 8 | Issues: 2

VIRL

Code for V-IRL: Grounding Virtual Intelligence in Real Life

just-the-class

A modern, highly customizable, responsive Jekyll template for course websites.

Language: SCSS | License: MIT | Stargazers: 268 | Issues: 6 | Issues: 12

mix3d

Mix3D: Out-of-Context Data Augmentation for 3D Scenes (3DV 2021 Oral)

long_seq_mae

Code release for the research paper "Exploring Long-Sequence Masked Autoencoders"

Language: Python | License: NOASSERTION | Stargazers: 98 | Issues: 6 | Issues: 3
Language: Python | Stargazers: 4 | Issues: 0 | Issues: 0

Hate-LLaMA

An Instruction-tuned Audio-Visual Language Model for Hate Content Detection

Language: Python | License: BSD-3-Clause | Stargazers: 1 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 1 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 1 | Issues: 0 | Issues: 0