neos's repositories
tflite-micro
TensorFlow Lite for Microcontrollers
bitsandbytes
8-bit CUDA functions for PyTorch
composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
crack_leetcode
Five days of problem drills, three days of mock interviews! Quickly master LeetCode solution patterns!
CUDA-Programming
Sample codes for my CUDA programming book
firesim-nvdla
FireSim-NVDLA: NVIDIA Deep Learning Accelerator (NVDLA) Integrated with RISC-V Rocket Chip SoC Running on the Amazon FPGA Cloud
GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
how-to-optim-algorithm-in-cuda
How to optimize various algorithms in CUDA.
ITRI-OpenDLA
OpenDLA demo and FPGA solution
llama-int8
Quantized inference code for LLaMA models
llama.onnx
LLaMA ONNX models and an ONNX Runtime demo
mlir-emitc
Conversions to MLIR EmitC
qlib
Qlib is an AI-oriented quantitative investment platform that aims to realize the potential of AI technologies in quantitative investment, empowering research and creating value. With Qlib, you can easily try out your ideas and create better quant investment strategies. An increasing number of SOTA quant research works/papers are released in Qlib.
relay-bench
A repository containing examples and benchmarks for Relay.
relay-mlir
An MLIR-based toy DL compiler for TVM Relay.
tinyengine
[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; MCUNetV3: On-Device Training Under 256KB Memory