neos's repositories

tflite-micro

TensorFlow Lite for Microcontrollers

Language: C++ | License: Apache-2.0 | Stargazers: 1 | Issues: 1 | Issues: 0

bitsandbytes

8-bit CUDA functions for PyTorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

Language: C | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

bragghls

PyTorch model to RTL flow for low latency inference

Language: SystemVerilog | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

codon

A high-performance, zero-overhead, extensible Python compiler using LLVM

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

cpython

The Python programming language

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

crack_leetcode

Five days of practice problems, three days of mock tests! Quickly master LeetCode problem-solving patterns!

Stargazers: 0 | Issues: 0 | Issues: 0

CUDA-Programming

Sample codes for my CUDA programming book

Language: Cuda | License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

firesim-nvdla

FireSim-NVDLA: NVIDIA Deep Learning Accelerator (NVDLA) Integrated with RISC-V Rocket Chip SoC Running on the Amazon FPGA Cloud

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers"

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

GPTQ-for-LLaMa

4 bits quantization of LLaMa using GPTQ

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

hidet

An open-source efficient deep learning framework.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

how-to-optim-algorithm-in-cuda

how to optimize some algorithm in cuda.

Language: Cuda | Stargazers: 0 | Issues: 1 | Issues: 0

ITRI-OpenDLA

OpenDLA for trying the demo and FPGA solution

Language: C | Stargazers: 0 | Issues: 1 | Issues: 0

lc3-vm

Write your own virtual machine for the LC-3 computer!

Language: Makefile | Stargazers: 0 | Issues: 1 | Issues: 0

llama-int8

Quantized inference code for LLaMA models

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 1 | Issues: 0

llama.cpp

Port of Facebook's LLaMA model in C/C++

Language: C | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

llama.onnx

llama onnx models and onnxruntime demo

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 1 | Issues: 0

MegFlow

Efficient ML solution for long-tailed demands.

Language: Rust | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

mlir-emitc

Conversions to MLIR EmitC

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

nncg

NNCG: A Neural Network Code Generator

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.

Language: C++ | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

qlib

Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment. With Qlib, you can easily try your ideas to create better Quant investment strategies. An increasing number of SOTA Quant research works/papers are released in Qlib.

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

relay-bench

A repository containing examples and benchmarks for Relay.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

relay-mlir

An MLIR-based toy DL compiler for TVM Relay.

Language: C++ | Stargazers: 0 | Issues: 1 | Issues: 0

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

tinyengine

[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; MCUNetV3: On-Device Training Under 256KB Memory

Language: C | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

WaifuXL

State of the art image upscaling, directly in your browser.

Language: JavaScript | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0