int8

There are 33 repositories under the int8 topic.

intel / neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
low-precision pruning sparsity auto-tuning knowledge-distillation quantization quantization-aware-training post-training-quantization smoothquant large-language-models awq fp4 gptq int4 int8 mxformat sparsegpt
Language:Python 2278
intel / neural-speed
An innovative library for efficient LLM inference via low-bit quantization
cpu fp4 fp8 gaudi2 gpu int1 int2 int3 int4 int5 int6 int7 int8 llamacpp llm-fine-tuning llm-inference low-bit mxformat nf4 sparsity
Language:C++ 351
clancylian / retinaface
Reimplementation of RetinaFace using C++ and TensorRT
retinaface tensorrt int8 caffe mxnet2caffe
Language:C++ 297
Wulingtian / yolov5_tensorrt_int8_tools
TensorRT INT8 quantization of YOLOv5 ONNX models
int8 onnx tensorrt yolov5
Language:Python 177
Wulingtian / yolov5_tensorrt_int8
TensorRT INT8 quantized deployment of the YOLOv5s model; measured at 3.3 ms per frame!
int8 tensorrt yolov5
Language:C++ 166
Wulingtian / RepVGG_TensorRT_int8
RepVGG TensorRT INT8 quantization; measured inference under 1 ms per frame!
repvgg tensorrt int8
Language:Python 61
xuanandsix / Tensorrt-int8-quantization-pipline
A simple pipeline for INT8 quantization based on TensorRT.
classifaction int8 quantization tensorrt yolox
Language:Python 56
the0807 / YOLOv8-ONNX-TensorRT
👀 Apply YOLOv8 exported with ONNX or TensorRT(FP16, INT8) to the Real-time camera
computer-vision fp16 int8 object-detection onnx tensorrt yolov8
Language:Python 42
Wulingtian / nanodet_tensorrt_int8
NanoDet INT8 quantization; measured inference at 2 ms per frame!
nanodet tensorrt int8
Language:C++ 37
ppogg / ncnn-yolov4-int8
NCNN + INT8 + YOLOv4: quantized modeling and real-time inference
int8 ncnn real-time yolov4
Language:C++ 23
whitelok / tensorrt-int8-python-sample
TensorRT Int8 Python sample.
tensorrt ai deep-learning nvidia inference tensorrt-int8-python python int8-inference int8 machine-learning
Language:Python 14
aahouzi / llama2-chatbot-cpu
A LLaMA2-7b chatbot with memory running on CPU, and optimized using smooth quantization, 4-bit quantization or Intel® Extension For PyTorch with bfloat16.
4-bit-cpu bfloat16 chatbot chatbot-memory chatgpt cpu huggingface int8 intel ipex langchain llama llama2 meta meta-ai neural-compression numa optimization smooth-quantization streamlit
Language:Python 13
Egorundel / int8_calibrator_cpp
INT8 calibrator for ONNX model with dynamic batch_size at the input and NMS module at the output. C++ Implementation.
calibration cpp int8 tensorrt onnx
Language:C++ 10
cbalint13 / rvv-kernels
RISCV Vector Kernel C/LLVM-IR generator
int8 kernel llvm math riscv rvv tvm vector
Language:C 7
egbertYeah / mt-yolov6_tensorrt
MT-Yolov6 TensorRT Inference with Python.
tensorrt int8 yolov6
Language:Python 6
dasdristanta13 / LLM-Lora-PEFT_accumulate
LLM-Lora-PEFT_accumulate explores optimizations for Large Language Models (LLMs) using PEFT, LORA, and QLORA. Contribute experiments and implementations to enhance LLM efficiency. Join discussions and push the boundaries of LLM optimization. Let's make LLMs more efficient together!
alpaca bitsandbytes falcon int8 llama llm lora peft qlora
Language:Jupyter Notebook 5
JohnClaw / chatllm.vb
VB.NET api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatllm cpu-inference gemma ggml int8 int8-inference int8-quantization llama llm-inference mistral qwen vb-net vbnet
Language:Visual Basic .NET 4
yester31 / Quantization_EX
Quantization example for PTQ & QAT
int8 model-optimization post-training-quantization ptq pytorch-quantization qat quantization quantization-aware-training tensorrt
Language:Python 4
JohnClaw / chatllm.cs
C# api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatllm csharp gemma ggml int8 int8-inference int8-quantization llama llm-inference mistral qwen cpu-inference inference llm llms
Language:C# 3
stdlib-js / array-int8
Int8Array.
nodejs javascript stdlib node node-js types data structure array typed typed-array int8array int8 integer int signed byte
Language:JavaScript 2
stdlib-js / constants-int8
8-bit signed integer mathematical constants.
nodejs javascript stdlib node node-js mathematics math standard library lib constants const namespace int8 8-bit 8bit integer int byte signed
Language:JavaScript 2
stdlib-js / constants-int8-min
Minimum signed 8-bit integer.
nodejs javascript stdlib node node-js constant const min minimum integer int int8 signed 8-bit
Language:JavaScript 2
lbin / gie_int8_sample
int8 gie
Language:C++ 1
RyannnG / gie_int8_sample
int8 tensorrt2
Language:C++ 1
stdlib-js / assert-is-int8array
Test if a value is an Int8Array.
nodejs javascript stdlib node node-js assertion assert utilities utility utils util int8array int8 signed integer int byte octet typed typed-array
Language:JavaScript 1
stdlib-js / constants-int8-max
Maximum signed 8-bit integer.
nodejs javascript stdlib node node-js constant const max maximum integer int int8 signed 8-bit
Language:JavaScript 1
stdlib-js / constants-int8-num-bytes
Size (in bytes) of an 8-bit signed integer.
nodejs javascript stdlib node node-js constant const mathematics math int8 8-bit 8bit integer int byte signed size sizeof size-of bytes
Language:JavaScript 1
stdlib-js / napi-argv-int8array
Convert a Node-API value to a signed 8-bit integer array.
addon array int int8 javascript macros napi native node node-js nodejs stdlib utilities utils
Language:C 1
stdlib-js / napi-argv-strided-int8array
Convert a Node-API value representing a strided array to a signed 8-bit integer array.
addon array int int8 javascript macros napi native ndarray node node-js nodejs stdlib strided utilities utils
Language:C 1
stdlib-js / napi-argv-strided-int8array2d
Convert a Node-API value representing a two-dimensional strided array to a signed 8-bit integer array.
addon array int8 integer javascript macros matrix napi native ndarray node node-js nodejs stdlib strided utilities utils
Language:C 1
douzsh / mxnet-quantized
MXNet GluonCV quantization: binary and ternary models
mxnet quantization binary ternary int8 gluoncv
Language:Python 0
MrFMach / Practice-C-types
Practicing C data types using the sizeof function
sizeof c char int8 int16 int32 int64 int128 float double
Language:C
yester31 / Quantization_Framework
development quantization framework
compression infernece int8 optimization quantization
Language:Python
