Nianhui Guo (NicoNico6)

Company: Hasso Plattner Institute (HPI)

Location: Potsdam, Germany

Nianhui Guo's repositories

BitNet

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch

Language: Python | License: MIT | Stars: 0 | Issues: 0
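
The core trick, roughly as the paper describes it, is a drop-in linear layer whose weights are binarized to {-1, +1} on the forward pass. A minimal sketch of the idea (illustrative, not this repo's actual code; the paper's activation quantization and normalization are omitted):

```python
import torch
import torch.nn as nn

class BitLinear(nn.Linear):
    """Linear layer with 1-bit weights; a simplified sketch of BitNet's idea."""
    def forward(self, x):
        w = self.weight - self.weight.mean()   # center the weights
        gamma = w.abs().mean()                 # rescale to preserve magnitude
        # Straight-through estimator: sign() in the forward pass,
        # identity gradient in the backward pass.
        w_bin = (torch.sign(w) - w).detach() + w
        return nn.functional.linear(x, w_bin * gamma, self.bias)
```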

AQLM

Official PyTorch repository for "Extreme Compression of Large Language Models via Additive Quantization" (https://arxiv.org/pdf/2401.06118.pdf)

License: Apache-2.0 | Stars: 0 | Issues: 0
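
The additive part is what distinguishes AQLM from ordinary codebook quantization: each group of weights is reconstructed as the sum of one vector from each of several learned codebooks. A hedged sketch of the decode step (shapes and names are made up for illustration):

```python
import torch

M, K, g = 2, 256, 8                     # codebooks, entries each, group size
codebooks = torch.randn(M, K, g)        # stand-in for learned codebooks
codes = torch.randint(0, K, (1024, M))  # one index per codebook per group

def dequantize(codes, codebooks):
    # Each weight group = sum over codebooks of the selected vectors.
    vecs = torch.stack([codebooks[m, codes[:, m]] for m in range(M)])  # (M,n,g)
    return vecs.sum(dim=0)                                             # (n,g)

groups = dequantize(codes, codebooks)   # then reshape into the weight matrix
```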

ColossalAI

Making big AI models cheaper, easier, and scalable

Language: Python | License: Apache-2.0 | Stars: 0 | Issues: 0

FastChat

The release repo for "Vicuna: An Open Chatbot Impressing GPT-4"

Language: Python | License: Apache-2.0 | Stars: 0 | Issues: 0

gpt-fast

Simple and efficient PyTorch-native transformer text generation in under 1,000 lines of Python.

License: BSD-3-Clause | Stars: 0 | Issues: 0
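
Stripped of KV caching, torch.compile, and the int8/int4 paths, the skeleton of such a generation loop is just argmax-and-append. A sketch (the model call signature here is assumed, not gpt-fast's actual interface):

```python
import torch

@torch.no_grad()
def greedy_generate(model, tokens, n_new=32):
    """Minimal greedy decoding: tokens is (1, seq) of token ids."""
    for _ in range(n_new):
        logits = model(tokens)                         # (1, seq, vocab) assumed
        next_tok = logits[:, -1].argmax(dim=-1, keepdim=True)
        tokens = torch.cat([tokens, next_tok], dim=1)  # append and re-feed
    return tokens
```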

hqq

Official implementation of Half-Quadratic Quantization (HQQ)

License: Apache-2.0 | Stars: 0 | Issues: 0
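
For context, the baseline HQQ improves on is plain min-max affine quantization; HQQ then refines the zero-point with a half-quadratic solver, which this sketch omits:

```python
import torch

def quantize_affine(w, bits=4):
    """Plain min-max asymmetric quantization (HQQ's starting point, not HQQ itself)."""
    qmax = 2**bits - 1
    scale = (w.max() - w.min()).clamp(min=1e-6) / qmax
    zero = -w.min() / scale
    return torch.clamp(torch.round(w / scale + zero), 0, qmax), scale, zero

def dequantize_affine(q, scale, zero):
    return (q - zero) * scale

w = torch.randn(64, 64)
q, s, z = quantize_affine(w)
print((w - dequantize_affine(q, s, z)).abs().mean())   # mean quantization error
```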

KIVI

KIVI: A Tuning-Free Asymmetric 2-bit Quantization for KV Cache

License: MIT | Stars: 0 | Issues: 0
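
The "asymmetric" in the title refers to treating K and V differently: keys are quantized per-channel (statistics taken over tokens) and values per-token. A sketch of that axis choice, with shapes assumed for illustration:

```python
import torch

def quantize_minmax(x, bits=2, dim=-1):
    """Min-max quantization with statistics taken along `dim`."""
    qmax = 2**bits - 1
    mn, mx = x.amin(dim=dim, keepdim=True), x.amax(dim=dim, keepdim=True)
    scale = (mx - mn).clamp(min=1e-6) / qmax
    return torch.clamp(torch.round((x - mn) / scale), 0, qmax), scale, mn

K = torch.randn(128, 64)                 # (tokens, head_dim), assumed layout
V = torch.randn(128, 64)
qK, sK, zK = quantize_minmax(K, dim=0)   # per-channel: reduce over tokens
qV, sV, zV = quantize_minmax(V, dim=-1)  # per-token: reduce over channels
```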

Lion-vs-Adam

Lion and Adam optimization comparison

Language: Jupyter Notebook | Stars: 0 | Issues: 0
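
The comparison is easy to state because Lion's update rule is so small: it keeps one momentum buffer and steps by the sign of an interpolated gradient, where Adam keeps two moment estimates. A sketch of a single Lion step, following the published rule:

```python
import torch

def lion_step(param, grad, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion update: sign of interpolated momentum, decoupled weight decay."""
    update = torch.sign(beta1 * m + (1 - beta1) * grad)
    param = param - lr * (update + wd * param)
    m = beta2 * m + (1 - beta2) * grad   # momentum is updated after the step
    return param, m
```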

LLaMA-Efficient-Tuning

Easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA)

License: Apache-2.0 | Stars: 0 | Issues: 0
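
The PEFT pattern underneath frameworks like this one is a few lines with Hugging Face's peft library; this is the generic recipe, not this repo's own CLI, and the model name is only illustrative:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, config)   # wraps the frozen base model
model.print_trainable_parameters()      # only the LoRA adapters train
```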

LLM-Pruner

LLM-Pruner: On the Structural Pruning of Large Language Models

License: Apache-2.0 | Stars: 0 | Issues: 0
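
Structural pruning removes whole neurons or heads so the model actually shrinks, rather than masking individual weights. A toy sketch using L2-norm importance (LLM-Pruner itself uses gradient-based importance scores and handles coupled structures across layers):

```python
import torch
import torch.nn as nn

def prune_rows(layer: nn.Linear, keep_ratio=0.75):
    """Drop the lowest-norm output neurons; downstream in_features must shrink too."""
    k = int(layer.out_features * keep_ratio)
    keep = layer.weight.norm(dim=1).topk(k).indices.sort().values
    pruned = nn.Linear(layer.in_features, k, bias=layer.bias is not None)
    pruned.weight.data = layer.weight.data[keep].clone()
    if layer.bias is not None:
        pruned.bias.data = layer.bias.data[keep].clone()
    return pruned
```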

MiniMA

Code for the paper "Towards the Law of Capacity Gap in Distilling Language Models"

License: Apache-2.0 | Stars: 0 | Issues: 0
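
Distillation setups like this one train the student against the teacher's softened output distribution. The standard logit-distillation loss such work builds on (the textbook form, not necessarily MiniMA's exact objective):

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) at temperature T, scaled by T^2 as usual."""
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * T * T
```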

mixtral-offloading

Run Mixtral-8x7B models in Colab or on consumer desktops

License: MIT | Stars: 0 | Issues: 0
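
The enabling observation is that an MoE layer only activates a couple of experts per token, so the rest can live in CPU RAM. A deliberately naive sketch of the idea (the real repo adds an LRU expert cache and speculative prefetching):

```python
import torch.nn as nn

class OffloadedExpert(nn.Module):
    """Keep an expert on CPU; page it onto the GPU only when routed to."""
    def __init__(self, expert: nn.Module, device="cuda"):
        super().__init__()
        self.expert = expert.cpu()
        self.device = device

    def forward(self, x):
        self.expert.to(self.device)          # page in
        out = self.expert(x.to(self.device))
        self.expert.to("cpu")                # page out to free VRAM
        return out
```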

MS-AMP

Microsoft Automatic Mixed Precision Library

License: MIT | Stars: 0 | Issues: 0
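
MS-AMP pushes mixed precision down to FP8 with its own scaling machinery; its API is not shown here. For contrast, the stock PyTorch AMP pattern it generalizes looks like this:

```python
import torch

model = torch.nn.Linear(512, 512).cuda()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(8, 512, device="cuda")
with torch.autocast("cuda", dtype=torch.float16):
    loss = model(x).square().mean()   # forward runs in fp16
scaler.scale(loss).backward()         # scale loss to avoid fp16 underflow
scaler.step(opt)
scaler.update()
```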

NBCE

Naive Bayes-based Context Extension

Stars: 0 | Issues: 0
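
The naive Bayes assumption lets independent context windows vote on the next token: combine per-context log-probabilities and subtract the context-free prior. A simplified mean-pooled variant (the repo's own pooling rule is more elaborate):

```python
import torch

def nbce_logits(ctx_logits, plain_logits, beta=0.25):
    """ctx_logits: (n_contexts, vocab); plain_logits: (vocab,) with no context.
    log p(t | S1..Sn) ~ (1 + beta) * mean_i log p(t | Si) - beta * log p(t)."""
    logp = torch.log_softmax(ctx_logits, dim=-1)
    logp0 = torch.log_softmax(plain_logits, dim=-1)
    return (1 + beta) * logp.mean(dim=0) - beta * logp0
```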

OmniQuant

OmniQuant is a simple and powerful quantization technique for LLMs.

Stars: 0 | Issues: 0
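
OmniQuant's headline component is learnable weight clipping: instead of hand-tuned min-max ranges, sigmoid-parameterized factors shrink the clipping range and are optimized by gradient descent. A sketch of that fake-quantization step (the full method also learns equivalent transformations and uses a straight-through estimator for round):

```python
import torch

def fake_quant_lwc(w, gamma, beta, bits=4):
    """Affine fake-quantization with learnable clipping factors gamma, beta."""
    qmax = 2**bits - 1
    mx = torch.sigmoid(gamma) * w.max()   # learned upper clip
    mn = torch.sigmoid(beta) * w.min()    # learned lower clip
    scale = (mx - mn).clamp(min=1e-6) / qmax
    zero = torch.round(-mn / scale)
    q = torch.clamp(torch.round(w / scale) + zero, 0, qmax)
    return (q - zero) * scale             # dequantized weights for the loss
```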

QIGen

Repository for CPU Kernel Generation for LLM Inference

Stars: 0 | Issues: 0

qmoe

Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

License: Apache-2.0 | Stars: 0 | Issues: 0

QuIP-for-Llama

Code for the paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees", adapted for Llama models

License: GPL-3.0 | Stars: 0 | Issues: 0
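
QuIP's incoherence processing conjugates the weight matrix by random orthogonal matrices so outliers get spread out before 2-bit rounding; the rotation is undone (or folded into adjacent layers) afterwards. A sketch with dense QR-sampled rotations (the paper uses structured random orthogonals for efficiency):

```python
import torch

def random_orthogonal(n):
    q, _ = torch.linalg.qr(torch.randn(n, n))
    return q

def incoherence_process(w):
    """Return the rotated matrix to quantize, plus the rotations to undo it."""
    H, G = random_orthogonal(w.shape[0]), random_orthogonal(w.shape[1])
    return H @ w @ G.T, H, G
```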

soft-prompt-tuning

Prompt tuning for GPT-J

License: MIT | Stars: 0 | Issues: 0
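
Prompt tuning freezes the LM and learns only a handful of "virtual token" embeddings prepended to the input. A self-contained sketch of the prepending module (dimensions are illustrative):

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Learnable prompt embeddings concatenated before the input embeddings."""
    def __init__(self, n_tokens=20, dim=4096):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_tokens, dim) * 0.02)

    def forward(self, input_embeds):               # (batch, seq, dim)
        p = self.prompt.unsqueeze(0).expand(input_embeds.size(0), -1, -1)
        return torch.cat([p, input_embeds], dim=1) # the frozen LM sees this
```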

tiger

A Tight-fisted Optimizer

License: MIT | Stars: 0 | Issues: 0
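
Tiger is "tight-fisted" in memory: as I understand it, roughly Lion with a single beta (beta1 == beta2), so one momentum buffer and a sign update suffice. A sketch of a single step under that reading:

```python
import torch

def tiger_step(param, grad, m, lr=1e-4, beta=0.965, wd=0.0):
    """One Tiger-style update: EMA momentum, then step by its sign."""
    m = beta * m + (1 - beta) * grad
    param = param - lr * (torch.sign(m) + wd * param)
    return param, m
```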

torch-int

This repository contains integer operators on GPUs for PyTorch.

License: MIT | Stars: 0 | Issues: 0
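
The point of such kernels is to multiply quantized operands in integer arithmetic and rescale once at the end. A pure-PyTorch simulation of an INT8 GEMM (real kernels accumulate in INT32 on the GPU; this only mimics the math):

```python
import torch

def int8_matmul_sim(x, w, x_scale, w_scale):
    """Quantize to int8 range, multiply in integer arithmetic, rescale to float."""
    x_q = torch.clamp(torch.round(x / x_scale), -128, 127).long()
    w_q = torch.clamp(torch.round(w / w_scale), -128, 127).long()
    acc = x_q @ w_q.t()                  # integer accumulation
    return acc.float() * (x_scale * w_scale)

x, w = torch.randn(4, 64), torch.randn(32, 64)
y = int8_matmul_sim(x, w, x.abs().max() / 127, w.abs().max() / 127)
```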