Mr.Li (bobo0810)

Company: North University of China

Location: Beijing

Mr.Li's starred repositories

hello-algo

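Hello Algo (《Hello 算法》): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart code. The Simplified and Traditional Chinese editions are updated in sync; English version ongoing.

Language: Java | License: NOASSERTION | Stargazers: 98241 | Issues: 537 | Issues: 226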

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 20203 | Issues: 157 | Issues: 1529

AISystem

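AISystem covers AI systems: the full stack of low-level AI technology, including AI chips, AI compilers, and AI inference and training frameworks.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 11160 | Issues: 152 | Issues: 37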

yolov9

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Language: Python | License: GPL-3.0 | Stargazers: 8978 | Issues: 56 | Issues: 536

Yi

A series of large language models trained from scratch by developers @01-ai

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7691 | Issues: 107 | Issues: 290

MiniCPM

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7121 | Issues: 76 | Issues: 210

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | License: NOASSERTION | Stargazers: 6923 | Issues: 64 | Issues: 1175

LeetCode-Book

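Python, Java, and C++ solution code for 《剑指 Offer》 (Coding Interviews), and the companion code repository for the LeetBook 《图解算法数据结构》 (Illustrated Algorithms and Data Structures).

Language: Java | License: NOASSERTION | Stargazers: 6310 | Issues: 49 | Issues: 7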

sglang

SGLang is a fast serving framework for large language models and vision language models.

Language: Python | License: Apache-2.0 | Stargazers: 6017 | Issues: 57 | Issues: 629

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Language: Python | License: MIT | Stargazers: 5991 | Issues: 52 | Issues: 605

YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Language: Python | License: GPL-3.0 | Stargazers: 4654 | Issues: 39 | Issues: 454

Video-LLaVA

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Language: Python | License: Apache-2.0 | Stargazers: 2981 | Issues: 28 | Issues: 185

Language: Python | License: Apache-2.0 | Stargazers: 2852 | Issues: 33 | Issues: 299

cv_note

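A record of a CV algorithm engineer's growth path, sharing notes on the computer vision and model compression and deployment tech stack. https://harleyszhang.github.io/cv_note/

Language: Python | License: Apache-2.0 | Stargazers: 2417 | Issues: 31 | Issues: 4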

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | License: MIT | Stargazers: 2071 | Issues: 19 | Issues: 47

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python | License: Apache-2.0 | Stargazers: 1979 | Issues: 24 | Issues: 92

Language: Python | License: Apache-2.0 | Stargazers: 1764 | Issues: 120 | Issues: 22

VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks

Language: Python | License: Apache-2.0 | Stargazers: 1321 | Issues: 10 | Issues: 209

MobileVLM

Strong and Open Vision Language Assistant for Mobile Devices

Language: Python | License: Apache-2.0 | Stargazers: 1038 | Issues: 21 | Issues: 57

Bunny

A family of lightweight multimodal models.

Language: Python | License: Apache-2.0 | Stargazers: 930 | Issues: 19 | Issues: 120

LanguageBind

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Language: Python | License: MIT | Stargazers: 719 | Issues: 15 | Issues: 62

MiniGPT4-video

Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding

Language: Python | License: BSD-3-Clause | Stargazers: 553 | Issues: 12 | Issues: 40

ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

VisionLLaMA

VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks

NeteaseTVDemo

NeteaseTVDemo (Vibefy) - tvOS client

Language: Swift | License: GPL-2.0 | Stargazers: 271 | Issues: 7 | Issues: 16

ALLaVA

Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model

Language: Python | License: Apache-2.0 | Stargazers: 244 | Issues: 11 | Issues: 11

awesome-mm-chat

A collection of multimodal (MM) + Chat resources

DenseFusion

DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

DataOptim

A collection of visual instruction tuning datasets.

Language: Python | License: MIT | Stargazers: 76 | Issues: 5 | Issues: 0