IST Austria Distributed Algorithms and Systems Lab (IST-DASLab)

Home Page: https://ista.ac.at/en/research/alistarh-group/

IST Austria Distributed Algorithms and Systems Lab's repositories

gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Language: Python | License: Apache-2.0 | Stargazers: 1804 | Issues: 29 | Issues: 48

sparsegpt

Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

Language: Python | License: Apache-2.0 | Stargazers: 666 | Issues: 16 | Issues: 31

marlin

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Language: Python | License: Apache-2.0 | Stargazers: 473 | Issues: 14 | Issues: 24

qmoe

Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

Language: Python | License: Apache-2.0 | Stargazers: 256 | Issues: 6 | Issues: 5
Language: Python | License: Apache-2.0 | Stargazers: 254 | Issues: 7 | Issues: 4

QUIK

Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference.

Language: C++ | License: Apache-2.0 | Stargazers: 162 | Issues: 6 | Issues: 6

OBC

Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".

SparseFinetuning

Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry.

Language: Python | License: Apache-2.0 | Stargazers: 35 | Issues: 5 | Issues: 6
Language: Python | License: Apache-2.0 | Stargazers: 28 | Issues: 7 | Issues: 1

QIGen

Repository for CPU Kernel Generation for LLM Inference

Language: Python | Stargazers: 25 | Issues: 6 | Issues: 0
Language: Cuda | License: Apache-2.0 | Stargazers: 17 | Issues: 6 | Issues: 1

spdy

Code for the ICML 2022 paper "SPDY: Accurate Pruning with Speedup Guarantees".

Language: C++ | License: Apache-2.0 | Stargazers: 13 | Issues: 7 | Issues: 0

peft-rosa

A fork of the PEFT library, supporting Robust Adaptation (RoSA)

Language: Python | License: Apache-2.0 | Stargazers: 11 | Issues: 0 | Issues: 0

CrAM

Code for reproducing the results from "CrAM: A Compression-Aware Minimizer" accepted at ICLR 2023

Language: Python | License: Apache-2.0 | Stargazers: 8 | Issues: 5 | Issues: 0

MicroAdam

This repository contains code for the MicroAdam paper.

Language: Python | License: Apache-2.0 | Stargazers: 6 | Issues: 0 | Issues: 0
Language: C++ | License: Apache-2.0 | Stargazers: 6 | Issues: 6 | Issues: 0

CAP

Repository for the source and experimental code of Correlation Aware Prune (NeurIPS 2023).

Language: Python | License: Apache-2.0 | Stargazers: 4 | Issues: 5 | Issues: 0

EFCP

Code to reproduce the experiments from the paper "Error Feedback Can Accurately Compress Preconditioners".

Language: Python | License: Apache-2.0 | Stargazers: 4 | Issues: 5 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 4 | Issues: 0 | Issues: 0

pruned-vision-model-bias

Code for reproducing the paper "Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures"

Language: Jupyter Notebook | Stargazers: 4 | Issues: 5 | Issues: 0

TACO4NLP

Task-aware compression for various NLP tasks.

KDVR

Code for the experiments in the NeurIPS 2023 paper "Knowledge Distillation Performs Partial Variance Reduction".

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 5 | Issues: 0

Mathador-LM

Code for the paper "Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on LLMs".

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 4 | Issues: 0

ZipLM

Code for the NeurIPS 2023 paper: "ZipLM: Inference-Aware Structured Pruning of Language Models".

Stargazers: 0 | Issues: 11 | Issues: 1
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

FastOBQ-

GPTQ with finetuning

Stargazers: 0 | Issues: 5 | Issues: 0

GridSearcher

GridSearcher simplifies running grid searches for machine learning projects in Python, emphasizing parallel execution and GPU scheduling without dependencies on SLURM or other workload managers.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

llm-foundry

LLM training code for Databricks foundation models

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

SPADE

Code for "SPADE: Sparsity Guided Debugging for Deep Neural Networks".

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0