steinate

笔移云误's starred repositories

osu-dreamer

a diffusion-based ML model for generating osu! maps from raw audio

Language: Python | License: MIT | 11500

LLaVA-3D

A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World

Language: Python | 10000

dreamerv3

Mastering Diverse Domains through World Models

Language: Python | License: MIT | 129900

MonoGS

[CVPR'24 Highlight & Best Demo Award] Gaussian Splatting SLAM

Language: Python | License: NOASSERTION | 128500

gcd

Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation

Language: Python | License: GPL-3.0 | 14600

HarmonyDream

Code release for "HarmonyDream: Task Harmonization Inside World Models" (ICML 2024), https://arxiv.org/abs/2310.00344

Language: Python | License: MIT | 2300

GPT-SoVITS

One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)

Language: Python | License: MIT | 3342800

planet

Learning Latent Dynamics for Planning from Pixels

Language: Python | License: Apache-2.0 | 117300

street_gaussians

[ECCV 2024] Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

Language: Python | License: NOASSERTION | 81700

SAM2Point

The Most Faithful Implementation of Segment Anything (SAM) in 3D

Language: Python | License: Apache-2.0 | 26200

drivestudio

A 3DGS framework for omni urban scene reconstruction and simulation.

Language: Python | License: MIT | 48600

rrt-algorithms

n-dimensional RRT, RRT* (RRT-Star)

Language: Python | License: MIT | 60600

BEVFormer_tensorrt

BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).

Language: Python | License: Apache-2.0 | 41600

Quest

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Language: Cuda | 17200

BreezeShane.github.io

My own private blog.

Language: TypeScript | 600

large-scale-curiosity

Code for the paper "Large-Scale Study of Curiosity-Driven Learning"

Language: Python | 80300

OmniNxt

[IROS'24 Oral] A Fully Open-source and Compact Aerial Robot with Omnidirectional Visual Perception

License: GPL-3.0 | 26100

CogVideo

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Language: Python | License: Apache-2.0 | 781900

StreamDiffusion

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Language: Python | License: Apache-2.0 | 951400

ORB_SLAM3

ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM

Language: C++ | License: GPL-3.0 | 644700

PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Language: Python | License: AGPL-3.0 | 494800

MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Language: Python | License: AGPL-3.0 | 1176500

carla

Open-source simulator for autonomous driving research.

Language: C++ | License: MIT | 1117300

S5

Language: Python | License: MIT | 25200

LAUDNet

[IEEE TPAMI] Latency-aware Unified Dynamic Networks for Efficient Image Recognition

Language: Jupyter Notebook | 4000

diffusion_policy

[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

Language: Python | License: MIT | 138600

AnimateLCM

[SIGGRAPH ASIA 2024 TCS] AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data

Language: Python | License: MIT | 57900

weighted-likelihood-filter

Code for the paper "Outlier-robust Kalman Filtering through Generalised Bayes" presented at ICML 2024

Language: Jupyter Notebook | 6000

GaussianSplats3D

Three.js-based implementation of 3D Gaussian splatting

Language: JavaScript | License: MIT | 137300

OVExp

OVExp: Open Vocabulary Exploration for Object-Oriented Navigation

2800