liury889's starred repositories

mmpose

OpenMMLab Pose Estimation Toolbox and Benchmark.

Language: Python | License: Apache-2.0 | Stargazers: 5574 | Issues: 0

LLM101n

LLM101n: Let's build a Storyteller

Stargazers: 28163 | Issues: 0

ChatDev

Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)

Language: Shell | License: Apache-2.0 | Stargazers: 25044 | Issues: 0

XAgent

An Autonomous LLM Agent for Complex Task Solving

Language: Python | License: Apache-2.0 | Stargazers: 8021 | Issues: 0

tutorials

PyTorch tutorials.

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 8097 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 26055 | Issues: 0

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python | License: Apache-2.0 | Stargazers: 15786 | Issues: 0

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python | License: MIT | Stargazers: 10297 | Issues: 0

S-LoRA

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Language: Python | License: Apache-2.0 | Stargazers: 1696 | Issues: 0

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | License: NOASSERTION | Stargazers: 5879 | Issues: 0

awesome-multi-agent-papers

A compilation of the best multi-agent papers

Stargazers: 159 | Issues: 0

HolmesVAD

Official implementation of "Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM"

Language: Python | License: MIT | Stargazers: 62 | Issues: 0

LocLLM

Code for "LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model", CVPR 2024 Highlight

Language: Python | License: MIT | Stargazers: 26 | Issues: 0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 4753 | Issues: 0

Pink

Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs

Language: Python | Stargazers: 72 | Issues: 0

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 11784 | Issues: 0

Language: Python | License: MIT | Stargazers: 4576 | Issues: 0

marker

Convert PDF to markdown quickly with high accuracy

Language: Python | License: GPL-3.0 | Stargazers: 16309 | Issues: 0

anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Language: Python | Stargazers: 639 | Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Stargazers: 11668 | Issues: 0

CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Language: Python | License: Apache-2.0 | Stargazers: 1953 | Issues: 0

dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

Language: TypeScript | License: NOASSERTION | Stargazers: 44994 | Issues: 0

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Language: Python | License: CC-BY-4.0 | Stargazers: 1148 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 7069 | Issues: 0

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python | License: Apache-2.0 | Stargazers: 1905 | Issues: 0

GroundingGPT

[ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model

Language: Python | License: Apache-2.0 | Stargazers: 284 | Issues: 0

Chat-UniVi

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

Language: Python | License: Apache-2.0 | Stargazers: 753 | Issues: 0

Video-LLaVA

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Language: Python | License: Apache-2.0 | Stargazers: 2835 | Issues: 0

grok-1

Grok open release

Language: Python | License: Apache-2.0 | Stargazers: 49411 | Issues: 0

VideoCrafter

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Language: Python | License: NOASSERTION | Stargazers: 4460 | Issues: 0