蓋瑞王's repositories
anything-llm
A multi-user ChatGPT for any LLM and vector database. Unlimited documents, messages, and storage in one privacy-focused app. Now available as a desktop application!
ChatTTS
A generative speech model for daily dialogue.
gpu-jupyter
Leverage the flexibility of JupyterLab and the power of your NVIDIA GPU to run TensorFlow and PyTorch code in collaborative notebooks on the GPU.
AutoDetect
Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.
cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
delta-iris
Efficient World Models with Context-Aware Tokenization. ICML 2024
dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
diffusion-forcing
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
diffusion-forcing-transformer
Transformer implementation for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
dspy
DSPy: The framework for programming—not prompting—foundation models
EvTexture
[ICML 2024] EvTexture: Event-driven Texture Enhancement for Video Super-Resolution
faster-whisper
Faster Whisper transcription with CTranslate2
GPT-SoVITS
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
lancedb
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
make-it-count
Official implementation of "Make It Count: Text-to-Image Generation with an Accurate Number of Objects"
MM-NIAH
This is the official implementation of the paper "Needle In A Multimodal Haystack"
MotionBooth
The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"
planetarium
Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
RVT
Official Code for RVT: Robotic View Transformer for 3D Object Manipulation
SenseVoice
Multilingual Voice Understanding Model
Taiwan-LLM
Traditional Mandarin LLMs for Taiwan
TalkTuner-chatbot-llm-dashboard
Designing a Dashboard for Transparency and Control of Conversational AI, https://arxiv.org/abs/2406.07882
xland-minigrid-datasets
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning