sosppxo

sosppxo's starred repositories

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python | License: MIT | 192600

JerryYin777.github.io

My Academic Website: https://jerrysys.top

Language: CSS | License: MIT | 600

mamba-code-explained

Language: Cuda | 1500

UniSeg3D

A Unified Framework for 3D Scene Understanding

Language: Python | License: Apache-2.0 | 6300

jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Language: Python | License: NOASSERTION | 256500

Open-MAGVIT2

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Language: Python | License: Apache-2.0 | 34400

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal chat model with performance approaching GPT-4o.

Language: Python | License: MIT | 439900

Open3DSG

Code for CVPR 2024 paper: Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships

Language: Python | License: AGPL-3.0 | 3100

LongVA

Long Context Transfer from Language to Vision

Language: Python | License: Apache-2.0 | 24900

VQGAN-LC

Language: Python | 7500

fish-speech

Brand new TTS solution

Language: Python | License: NOASSERTION | 643400

SpeechGPT

SpeechGPT Series: Speech Large Language Models

Language: Python | License: Apache-2.0 | 109300

GOV-NeSF

700

YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Language: Python | License: GPL-3.0 | 404800

semantic-gaussians

Official implementation of the paper "Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting".

Language: Python | License: MIT | 8100

chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Language: Python | License: NOASSERTION | 159000

Remote-Sensing-in-CVPR2024

Papers related to remote sensing in CVPR 2024

9600

SeaBird

[CVPR 2024] Official PyTorch Code of SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular 3D Detection of Large Objects

Language: Python | License: MIT | 7300

CityRefer

Language: Python | License: MIT | 3600

X-3D

X-3D: Explicit 3D Structure Modeling for Point Cloud Recognition (CVPR2024)

Language: Python | License: MIT | 1600

expert_readed_books

Latest 2021 roundup of recommended reading for engineers: computer science, software technology, entrepreneurship, **, mathematics, and biographies

637000

Replica-Dataset

The Replica Dataset v1 as published in https://arxiv.org/abs/1906.05797

Language: C++ | License: NOASSERTION | 95300

concept-graphs

Official code release for ConceptGraphs

Language: Python | License: MIT | 32800

PointTransformerV3

[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)

Language: Python | License: MIT | 64300

octformer

OctFormer: Octree-based Transformers for 3D Point Clouds

Language: Python | License: MIT | 22700

sd-webui-EasyPhoto

📷 EasyPhoto | Your Smart AI Photo Generator.

Language: Python | License: Apache-2.0 | 479900

VideoAgent

Official code for VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)

Language: Python | License: Apache-2.0 | 6100

MLLM-Tool

MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning

Language: Python | License: MIT | 8900

tree

tree is a library for working with nested data structures

Language: Python | License: Apache-2.0 | 92400

Open3DIS

Open3DIS: Open-vocabulary 3D Instance Segmentation with 2D Mask Guidance (CVPR 2024)

Language: Python | License: Apache-2.0 | 5100