ChaimZhu (ZCMax)

Company: HKU IDS | HKU-MMLab

Location: Hong Kong SAR

ChaimZhu's starred repositories

openvla

OpenVLA: An open-source vision-language-action model for robotic manipulation.

Language: Python | License: MIT | Stargazers: 369 | Issues: 0

acad-homepage.github.io

AcadHomepage: A Modern and Responsive Academic Personal Homepage

Language: SCSS | License: MIT | Stargazers: 995 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 70 | Issues: 0

gpu_poor

Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

Language: JavaScript | Stargazers: 686 | Issues: 0

llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Language: Jupyter Notebook | License: MIT | Stargazers: 10520 | Issues: 0
Language: Python | Stargazers: 38 | Issues: 0
Language: Python | Stargazers: 788 | Issues: 0
Language: CSS | License: MIT | Stargazers: 107 | Issues: 0

ml-visuals

🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.

License: MIT | Stargazers: 12641 | Issues: 0

Stratified-Transformer

Stratified Transformer for 3D Point Cloud Segmentation (CVPR 2022)

Language: Python | License: MIT | Stargazers: 350 | Issues: 0

rerun

Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.

Language: Rust | License: Apache-2.0 | Stargazers: 5550 | Issues: 0

PLLaVA

Official repository for the paper PLLaVA

Language: Python | Stargazers: 432 | Issues: 0

LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Language: Python | Stargazers: 718 | Issues: 0

omnidata

A Scalable Pipeline for Making Steerable Multi-Task Mid-Level Vision Datasets from 3D Scans [ICCV 2021]

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 372 | Issues: 0

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | Stargazers: 27629 | Issues: 0

3D-VLA

[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model

Language: Python | Stargazers: 199 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 22229 | Issues: 0

probe3d

[CVPR 2024] Probing the 3D Awareness of Visual Foundation Models

Language: Python | License: MIT | Stargazers: 211 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 8160 | Issues: 0

open-eqa

OpenEQA: Embodied Question Answering in the Era of Foundation Models

Language: Jupyter Notebook | License: MIT | Stargazers: 173 | Issues: 0
Language: Python | License: MIT | Stargazers: 118 | Issues: 0

Awesome-LLMs-for-Video-Understanding

🔥🔥🔥 Latest Papers, Codes and Datasets on Vid-LLMs.

Stargazers: 849 | Issues: 0

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 3053 | Issues: 0

LLaVA-Plus-Codebase

LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills

Language: Python | License: Apache-2.0 | Stargazers: 649 | Issues: 0

VQASynth

Compose multimodal datasets 🎹

Language: Python | Stargazers: 109 | Issues: 0

multi_token

Embed arbitrary modalities (images, audio, documents, etc) into large language models.

Language: Python | License: Apache-2.0 | Stargazers: 154 | Issues: 0

Chat-UniVi

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

Language: Python | License: Apache-2.0 | Stargazers: 688 | Issues: 0

VLMEvalKit

Open-source evaluation toolkit of large vision-language models (LVLMs), support GPT-4v, Gemini, QwenVLPlus, 50+ HF models, 20+ benchmarks

Language: Python | License: Apache-2.0 | Stargazers: 565 | Issues: 0

CLIP

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Language: Jupyter Notebook | License: MIT | Stargazers: 23197 | Issues: 0

act3d-chained-diffuser

A unified architecture for multimodal multi-task robotic policy learning.

Language: Python | Stargazers: 91 | Issues: 0