Misby's repositories
llm_kvcache_sparsity
Implementations of several LLM KV cache sparsity methods
MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level MLLM on Your Phone
libpfm4
This is a mirror of the official libpfm4 git repository, https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/, with some local branches for developing patches.
catapult
Deprecated Catapult GitHub. Please instead use the http://crbug.com "Speed>Benchmarks" component for bugs and https://chromium.googlesource.com/catapult for downloading and editing source code.
llamafile
Distribute and run LLMs with a single file.
LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
ccf-deadlines
⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python CLI, WeChat Applet) / If you find it useful, please star this project, thanks~
aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding
FastGPT
FastGPT is a knowledge-based platform built on LLMs. It offers out-of-the-box data processing and model invocation capabilities, and allows workflow orchestration through Flow visualization.
MCSD
Multi-Candidate Speculative Decoding
hpipm
High-performance interior-point-method QP and QCQP solvers
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
agi
Android GPU Inspector
llama2.c
Inference Llama 2 in one file of pure C
GPy
Gaussian processes framework in Python
LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
MiniGPT-4
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models
nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates a high-performance executable from a DNN model description.
MegEngine
MegEngine is a fast, scalable, and easy-to-use deep learning framework with support for automatic differentiation
examples
TensorFlow examples
TNN
TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop and server. TNN is distinguished by several outstanding features, including its cross-platform capability, high performance, model compression and code pruning. Based on ncnn and Rapidnet, TNN further strengthens the support and per
coder-kung-fu
Building up a developer's core fundamentals ("inner kung fu" training)
transformers-android-demo
📲 Transformers Android examples (TensorFlow Lite & PyTorch Mobile)