thomas-yanxin

thomas-yanxin's starred repositories

OpenDevin

🐚 OpenDevin: Code Less, Make More

Language: Python | License: MIT | 25709 286 892

mlx

MLX: An array framework for Apple silicon

Language: C++ | License: MIT | 14822 134 417

SillyTavern

LLM Frontend for Power Users.

Language: JavaScript | License: AGPL-3.0 | 6146 51 1228

AniPortrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Language: Python | License: Apache-2.0 | 3945 54 145

edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Language: Python | License: GPL-3.0 | 3763 37 175

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal chat model with performance approaching GPT-4V

Language: Jupyter Notebook | License: MIT | 2334 31 149

InstantMesh

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

Language: Python | License: Apache-2.0 | 2098 32 73

fish-speech

Brand new TTS solution

Language: Python | License: BSD-3-Clause | 1792 32 134

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python | License: MIT | 1046 23 21

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language: Python | License: Apache-2.0 | 986 26 68

SyncTalk

[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"

Language: Python | License: NOASSERTION | 888 64 117

Chat-UniVi

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

Language: Python | License: Apache-2.0 | 647 7 32

lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Language: Python | License: NOASSERTION | 598 3 53

Awesome-LLMs-Datasets

Summarize existing representative LLMs text datasets.

License: Apache-2.0 | 591 4 2

VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 40+ HF models, and 20+ benchmarks

Language: Python | 460 8 64

EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Language: Python | License: Apache-2.0 | 42000

databonsai

clean & curate your data with LLMs.

Language: Python | License: MIT | 406 2 2

RoleLLM-public

RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models

399 8 10

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

Language: Python | 379 6 17

FuseAI

FuseLLM & FuseChat Project

Language: Python | 33900

mergoo

A library for easily merging multiple LLM experts and efficiently training the merged LLM.

Language: Python | License: LGPL-3.0 | 310 5 8

StableTTS

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Language: Python | License: MIT | 252 26 12

LL3DA

[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.

Language: Python | License: MIT | 167 3 15

tc4d

TC4D: Trajectory-Conditioned Text-to-4D Generation

Language: Python | License: Apache-2.0 | 134 5 4

ShieldLM

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors

Language: Python | License: MIT | 71 4 7

multilingual-safety-for-LLMs

[ICLR 2024] Data for "Multilingual Jailbreak Challenges in Large Language Models"

License: MIT | 42 70

UltraLink

An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset

Language: Python | License: MIT | 13 70