bobzhang123's starred repositories

Awesome-LLM-Robotics

A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, code, and related websites

License: BSD-3-Clause | Stargazers: 2598 | Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Stargazers: 10996 | Issues: 0

segment-anything-2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 8154 | Issues: 0

GeMap

[ECCV'24] Online Vectorized HD Map Construction using Geometry

Language: Python | License: Apache-2.0 | Stargazers: 173 | Issues: 0

RoGS

RoGS: Large Scale Road Surface Reconstruction based on 2D Gaussian Splatting

Language: Python | License: Apache-2.0 | Stargazers: 22 | Issues: 0

APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Language: Python | License: Apache-2.0 | Stargazers: 465 | Issues: 0

2D-GS-Viser-Viewer

Simple Viser Viewer for 2D Gaussian Splatting for Geometrically Accurate Radiance Fields

Language: Python | Stargazers: 82 | Issues: 0

LEGaussians

Pytorch Code for "LEGaussians: Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding"

Language: Python | License: MIT | Stargazers: 94 | Issues: 0

FSGS

[ECCV 2024] "FSGS: Real-Time Few-Shot View Synthesis using Gaussian Splatting", Zehao Zhu*, Zhiwen Fan*, Yifan Jiang, Zhangyang Wang

Language: Python | License: NOASSERTION | Stargazers: 332 | Issues: 0

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 8929 | Issues: 0

gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Language: Python | License: NOASSERTION | Stargazers: 13051 | Issues: 0

VMA

A general map auto annotation framework based on MapTR, with high flexibility in terms of spatial scale and element type

Language: Python | License: MIT | Stargazers: 189 | Issues: 0

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python | License: Apache-2.0 | Stargazers: 1623 | Issues: 0

S3Gaussian

Official Implementation of Self-Supervised Street Gaussians for Autonomous Driving

Language: Python | License: NOASSERTION | Stargazers: 350 | Issues: 0

awesome-scene-understanding

😎 A list of awesome scene understanding papers.

License: MIT | Stargazers: 673 | Issues: 0

Switch-NeRF

Codes for Switch-NeRF (ICLR 2023)

Language: Python | License: MIT | Stargazers: 191 | Issues: 0

annotated_deep_learning_paper_implementations

🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Python | License: MIT | Stargazers: 52484 | Issues: 0

2d-gaussian-splatting

[SIGGRAPH'24] 2D Gaussian Splatting for Geometrically Accurate Radiance Fields

Language: Python | License: NOASSERTION | Stargazers: 1753 | Issues: 0

VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 30+ benchmarks

Language: Python | License: Apache-2.0 | Stargazers: 811 | Issues: 0

opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Language: Python | License: Apache-2.0 | Stargazers: 3455 | Issues: 0

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal chat model approaching GPT-4o performance.

Language: Python | License: MIT | Stargazers: 4595 | Issues: 0

Language: Python | Stargazers: 1425 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 18522 | Issues: 0

Language: Python | Stargazers: 204 | Issues: 0

CLIM

[AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation

Language: Python | License: NOASSERTION | Stargazers: 25 | Issues: 0

MQ-Det

Official PyTorch implementation of "Multi-modal Queried Object Detection in the Wild" (accepted by NeurIPS 2023)

Language: Python | License: Apache-2.0 | Stargazers: 252 | Issues: 0

Vista

A Generalizable World Model for Autonomous Driving

Language: Python | License: Apache-2.0 | Stargazers: 426 | Issues: 0

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

License: MIT | Stargazers: 3200 | Issues: 0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 4483 | Issues: 0

Awesome-Open-Vocabulary-Detection-and-Segmentation

Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future

Stargazers: 76 | Issues: 0