Nianhui Guo (NicoNico6)

Company: Hasso Plattner Institute (HPI)

Location: Potsdam, Germany


Nianhui Guo's repositories


bitorch-engine

A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks.

Language: Python · License: Apache-2.0 · Stars: 0 · Issues: 0

AQLM

Official PyTorch repository for "Extreme Compression of Large Language Models via Additive Quantization" (https://arxiv.org/pdf/2401.06118.pdf)

License: Apache-2.0 · Stars: 0 · Issues: 0

Awesome-LLM-Compression

Awesome LLM compression research papers and tools.

License: MIT · Stars: 0 · Issues: 0

BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

License: MIT · Stars: 0 · Issues: 0

buffer-of-thought-llm

Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

Stars: 0 · Issues: 0

ETO

Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents

Stars: 0 · Issues: 0

evolutionary-model-merge

Official repository of Evolutionary Optimization of Model Merging Recipes

License: Apache-2.0 · Stars: 0 · Issues: 0

fast-hadamard-transform

Fast Hadamard transform in CUDA, with a PyTorch interface

License: BSD-3-Clause · Stars: 0 · Issues: 0
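For illustration, the Walsh–Hadamard transform that this repository accelerates can be sketched with the classic butterfly recursion in pure Python. This is a minimal, unnormalized reference version, not the repo's CUDA implementation:

```python
def fwht(x):
    """Fast Walsh-Hadamard transform (unnormalized, out-of-place).

    Runs in O(n log n) via butterfly passes; len(x) must be a
    power of two. Applying it twice scales the input by n.
    """
    x = list(x)
    n = len(x)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        # Combine pairs (x[j], x[j+h]) into their sum and difference.
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x
```

For example, `fwht([1, 0, 0, 0])` yields `[1, 1, 1, 1]`, and applying `fwht` twice to a length-4 vector multiplies it by 4.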

gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 lines of Python.

License: BSD-3-Clause · Stars: 0 · Issues: 0

hqq

Official implementation of Half-Quadratic Quantization (HQQ)

License: Apache-2.0 · Stars: 0 · Issues: 0

KIVI

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

License: MIT · Stars: 0 · Issues: 0
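The general idea behind low-bit KV-cache schemes like this one is asymmetric uniform quantization: map floats onto a small integer grid via a scale and zero-point. The sketch below is a generic per-tensor illustration of that idea only; KIVI's actual method quantizes keys per-channel and values per-token, which this does not implement:

```python
def quantize_asym(x, bits=2):
    """Asymmetric uniform quantization of a list of floats.

    Maps values into integers in [0, 2**bits - 1] using a single
    scale and zero-point; returns (q, scale, zero_point).
    """
    qmax = (1 << bits) - 1
    lo, hi = min(x), max(x)
    # Guard against a constant tensor (zero range).
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in x]
    return q, scale, lo

def dequantize(q, scale, zero_point):
    """Reconstruct approximate floats from quantized integers."""
    return [v * scale + zero_point for v in q]
```

With `bits=2` each value is stored in one of four levels, so the round-trip error is bounded by half the scale.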

KVQuant

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Stars: 0 · Issues: 0

llamafile

Distribute and run LLMs with a single file.

License: NOASSERTION · Stars: 0 · Issues: 0

MiniMA

Code for the paper "Towards the Law of Capacity Gap in Distilling Language Models"

License: Apache-2.0 · Stars: 0 · Issues: 0

mixtral-offloading

Run Mixtral-8x7B models in Colab or on consumer desktops

License: MIT · Stars: 0 · Issues: 0

octopus-v4

AI for all: Build the large graph of the language models

License: NOASSERTION · Stars: 0 · Issues: 0

Pruner-Zero

Evolving Symbolic Pruning Metric from scratch

License: MIT · Stars: 0 · Issues: 0

QQQ

QQQ is an innovative and hardware-optimized W4A8 quantization solution.

Stars: 0 · Issues: 0

ShiftAddLLM

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization

License: Apache-2.0 · Stars: 0 · Issues: 0

transformerlab-app

Experiment with Large Language Models

License: AGPL-3.0 · Stars: 0 · Issues: 0

UDR

ACL'23: Unified Demonstration Retriever for In-Context Learning

Stars: 0 · Issues: 0