sosppxo

Company: Xiamen University

sosppxo's starred repositories

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python | License: MIT | Stargazers: 1926 | Issues: 0

JerryYin777.github.io

My Academic Website: https://jerrysys.top

Language: CSS | License: MIT | Stargazers: 6 | Issues: 0
Language: Cuda | Stargazers: 15 | Issues: 0

UniSeg3D

A Unified Framework for 3D Scene Understanding

Language: Python | License: Apache-2.0 | Stargazers: 63 | Issues: 0

jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Language: Python | License: NOASSERTION | Stargazers: 2565 | Issues: 0

Open-MAGVIT2

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Language: Python | License: Apache-2.0 | Stargazers: 344 | Issues: 0

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model with performance approaching GPT-4o.

Language: Python | License: MIT | Stargazers: 4399 | Issues: 0

Open3DSG

Code for CVPR 2024 paper: Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships

Language: Python | License: AGPL-3.0 | Stargazers: 31 | Issues: 0

LongVA

Long Context Transfer from Language to Vision

Language: Python | License: Apache-2.0 | Stargazers: 249 | Issues: 0
Language: Python | Stargazers: 75 | Issues: 0

fish-speech

Brand new TTS solution

Language: Python | License: NOASSERTION | Stargazers: 6434 | Issues: 0

SpeechGPT

SpeechGPT Series: Speech Large Language Models

Language: Python | License: Apache-2.0 | Stargazers: 1093 | Issues: 0
Stargazers: 7 | Issues: 0

YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Language: Python | License: GPL-3.0 | Stargazers: 4048 | Issues: 0

semantic-gaussians

Official implementation of the paper "Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting".

Language: Python | License: MIT | Stargazers: 81 | Issues: 0

chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Language: Python | License: NOASSERTION | Stargazers: 1590 | Issues: 0

Remote-Sensing-in-CVPR2024

Papers related to remote sensing in CVPR 2024

Stargazers: 96 | Issues: 0

SeaBird

[CVPR 2024] Official PyTorch Code of SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular 3D Detection of Large Objects

Language: Python | License: MIT | Stargazers: 73 | Issues: 0
Language: Python | License: MIT | Stargazers: 36 | Issues: 0

X-3D

X-3D: Explicit 3D Structure Modeling for Point Cloud Recognition (CVPR2024)

Language: Python | License: MIT | Stargazers: 16 | Issues: 0

expert_readed_books

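Latest 2021 roundup of recommended reading for engineers: computer science, software technology, entrepreneurship, ** category, mathematics, and biographies.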

Stargazers: 6370 | Issues: 0

Replica-Dataset

The Replica Dataset v1 as published in https://arxiv.org/abs/1906.05797

Language: C++ | License: NOASSERTION | Stargazers: 953 | Issues: 0

concept-graphs

Official code release for ConceptGraphs

Language: Python | License: MIT | Stargazers: 328 | Issues: 0

PointTransformerV3

[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)

Language: Python | License: MIT | Stargazers: 643 | Issues: 0

octformer

OctFormer: Octree-based Transformers for 3D Point Clouds

Language: Python | License: MIT | Stargazers: 227 | Issues: 0

sd-webui-EasyPhoto

📷 EasyPhoto | Your Smart AI Photo Generator.

Language: Python | License: Apache-2.0 | Stargazers: 4799 | Issues: 0

VideoAgent

Official code for VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)

Language: Python | License: Apache-2.0 | Stargazers: 61 | Issues: 0

MLLM-Tool

MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning

Language: Python | License: MIT | Stargazers: 89 | Issues: 0

tree

tree is a library for working with nested data structures

Language: Python | License: Apache-2.0 | Stargazers: 924 | Issues: 0

Open3DIS

Open3DIS: Open-vocabulary 3D Instance Segmentation with 2D Mask Guidance (CVPR 2024)

Language: Python | License: Apache-2.0 | Stargazers: 51 | Issues: 0