Deepware (deepware-ai)


High Performance Computing on FPGA

Home Page: https://deepware.ru

Deepware's repositories

Sextans

An FPGA accelerator for general-purpose Sparse-Matrix Dense-Matrix Multiplication (SpMM).

License: MIT · Stargazers: 0 · Issues: 0
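
For readers who don't know the kernel Sextans targets, the following is a minimal software reference for SpMM (a sparse matrix A in CSR form times a dense matrix B). It is only a concept sketch, not code from the Sextans repository; the CSR layout and all names are illustrative assumptions.

```cpp
#include <cstddef>
#include <vector>

// Minimal CSR representation of a sparse matrix A (illustrative, not Sextans' format).
struct CsrMatrix {
    std::size_t rows, cols;
    std::vector<std::size_t> row_ptr;   // size rows + 1
    std::vector<std::size_t> col_idx;   // size nnz
    std::vector<float>       values;    // size nnz
};

// Reference SpMM: C = A * B, where A is sparse (CSR) and B, C are dense row-major.
// B has shape (A.cols x n), C has shape (A.rows x n).
void spmm_reference(const CsrMatrix& A, const std::vector<float>& B,
                    std::vector<float>& C, std::size_t n) {
    C.assign(A.rows * n, 0.0f);
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t p = A.row_ptr[i]; p < A.row_ptr[i + 1]; ++p) {
            const float a = A.values[p];
            const std::size_t k = A.col_idx[p];
            for (std::size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[k * n + j];   // one nonzero scales a whole row of B
        }
    }
}
```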

SparseP

SparseP is the first open-source Sparse Matrix Vector Multiplication (SpMV) software package for real-world Processing-In-Memory (PIM) architectures. [https://arxiv.org/abs/2201.05072]

License: MIT · Stargazers: 0 · Issues: 0

Serpens

An HBM FPGA based SpMV Accelerator

License: MIT · Stargazers: 0 · Issues: 0

trans-fat

An FPGA Accelerator for Transformer Inference (BERT)

Stargazers: 0 · Issues: 0
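
Transformer inference of the kind trans-fat accelerates is dominated by GEMMs and scaled dot-product attention. As a concept illustration only (not code from trans-fat; the single-head layout, dimensions, and names are assumptions), the sketch below computes softmax(Q·Kᵀ/√d)·V in plain C++.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Single-head scaled dot-product attention: out = softmax(Q * K^T / sqrt(d)) * V.
// Q, K, V and out are (seq_len x d), row-major. Purely a reference sketch.
void attention(const std::vector<float>& Q, const std::vector<float>& K,
               const std::vector<float>& V, std::vector<float>& out,
               std::size_t seq_len, std::size_t d) {
    const float scale = 1.0f / std::sqrt(static_cast<float>(d));
    out.assign(seq_len * d, 0.0f);
    std::vector<float> scores(seq_len);
    for (std::size_t i = 0; i < seq_len; ++i) {
        // Scores of query i against every key, followed by a numerically stable softmax.
        float max_s = -1e30f;
        for (std::size_t j = 0; j < seq_len; ++j) {
            float s = 0.0f;
            for (std::size_t k = 0; k < d; ++k) s += Q[i * d + k] * K[j * d + k];
            scores[j] = s * scale;
            max_s = std::max(max_s, scores[j]);
        }
        float sum = 0.0f;
        for (std::size_t j = 0; j < seq_len; ++j) {
            scores[j] = std::exp(scores[j] - max_s);
            sum += scores[j];
        }
        // out[i] is the softmax-weighted sum of the value rows.
        for (std::size_t j = 0; j < seq_len; ++j) {
            const float w = scores[j] / sum;
            for (std::size_t k = 0; k < d; ++k) out[i * d + k] += w * V[j * d + k];
        }
    }
}
```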

How_to_optimize_in_GPU

A series of GPU optimization topics introducing, in detail, how to optimize programs on the GPU. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm; the performance of these kernels is at or near the theoretical limit.

License: Apache-2.0 · Stargazers: 1 · Issues: 0
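
The sgemm-style optimizations such a series usually builds up to revolve around tiling, so that each block of the output reuses operands held in fast memory (shared memory on a GPU, caches on a CPU). As a sketch of that reuse idea only, and not the repository's CUDA kernels, here is a cache-blocked CPU sgemm; the tile size and names are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Cache-blocked sgemm: C += A * B with A (M x K), B (K x N), C (M x N), all row-major.
// The same data-reuse idea underlies shared-memory tiling in GPU sgemm kernels.
void sgemm_blocked(const std::vector<float>& A, const std::vector<float>& B,
                   std::vector<float>& C, std::size_t M, std::size_t N, std::size_t K,
                   std::size_t T = 64) {            // T: tile size, an illustrative choice
    for (std::size_t i0 = 0; i0 < M; i0 += T)
        for (std::size_t k0 = 0; k0 < K; k0 += T)
            for (std::size_t j0 = 0; j0 < N; j0 += T)
                // Each (i0, k0, j0) tile keeps a small working set hot in cache.
                for (std::size_t i = i0; i < std::min(i0 + T, M); ++i)
                    for (std::size_t k = k0; k < std::min(k0 + T, K); ++k) {
                        const float a = A[i * K + k];
                        for (std::size_t j = j0; j < std::min(j0 + T, N); ++j)
                            C[i * N + j] += a * B[k * N + j];
                    }
}
```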

gemm_spmm

Hardware accelerator for pruned networks

License: GPL-2.0 · Stargazers: 1 · Issues: 0

SEAsynth

A synthesizable CNN accelerator based on systolic arrays 🌊

Stargazers: 2 · Issues: 0
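
As a rough behavioural illustration of the systolic-array dataflow behind accelerators like SEAsynth (this is not the repository's RTL; the grid size, output-stationary schedule, and names are assumptions), the sketch below models a grid of PEs in which skewed operand streams meet so that a[i][k] and b[k][j] arrive at PE(i, j) on the same cycle.

```cpp
#include <cstddef>
#include <vector>

// Behavioural model of an R x C output-stationary systolic array computing C = A * B,
// with A of shape (R x K) and B of shape (K x C), both row-major, all dims >= 1.
// Each PE(i, j) owns one output accumulator; operands are skewed so that a[i][k]
// (flowing right) meets b[k][j] (flowing down) at PE(i, j) on cycle t = i + j + k.
std::vector<float> systolic_matmul(const std::vector<float>& A,
                                   const std::vector<float>& B,
                                   std::size_t R, std::size_t C, std::size_t K) {
    std::vector<float> acc(R * C, 0.0f);              // one accumulator per PE
    const std::size_t last_cycle = R + C + K - 3;     // when a[R-1][K-1] meets b[K-1][C-1]
    for (std::size_t t = 0; t <= last_cycle; ++t) {
        for (std::size_t i = 0; i < R; ++i) {
            for (std::size_t j = 0; j < C; ++j) {
                // On cycle t, PE(i, j) sees operand pair k = t - i - j, if it exists.
                if (t >= i + j && t - i - j < K) {
                    const std::size_t k = t - i - j;
                    acc[i * C + j] += A[i * K + k] * B[k * C + j];
                }
            }
        }
    }
    return acc;                                       // row-major R x C result
}
```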

EdgeBERT

HW/SW co-design of sentence-level energy optimizations for latency-aware multi-task NLP inference

License: NOASSERTION · Stargazers: 0 · Issues: 0

Paddle-Lite

Multi-platform high performance deep learning inference engine for PaddlePaddle (飞桨)

License: Apache-2.0 · Stargazers: 0 · Issues: 0

SpinalHDL_CNN_Accelerator

CNN accelerator implemented with Spinal HDL

License: GPL-3.0 · Stargazers: 0 · Issues: 0

dory

A tool to deploy Deep Neural Networks on PULP-based SoC's

License: Apache-2.0 · Stargazers: 0 · Issues: 0

edge-ai

A curated list of resources for embedded AI

Stargazers: 1 · Issues: 0

nemo

NEural Minimizer for pytOrch

License: Apache-2.0 · Stargazers: 0 · Issues: 0

lenet5_hls

FPGA Accelerator for CNN using Vivado HLS

License: MIT · Stargazers: 0 · Issues: 0
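
To give a flavour of what a Vivado HLS CNN kernel looks like, here is a small HLS-friendly C++ convolution loop. It is a generic sketch rather than code from lenet5_hls; the layer sizes and names are assumptions, while #pragma HLS PIPELINE is a standard Vivado HLS directive asking the tool to start a new iteration of the pipelined loop every II cycles.

```cpp
// Generic HLS-style 2D convolution for a single input/output channel pair,
// valid padding, unit stride. Sizes are illustrative (LeNet-like 28x28 input, 5x5 kernel).
constexpr int IN_H = 28, IN_W = 28, KH = 5, KW = 5;
constexpr int OUT_H = IN_H - KH + 1, OUT_W = IN_W - KW + 1;

void conv2d(const float in[IN_H][IN_W], const float weight[KH][KW],
            float out[OUT_H][OUT_W]) {
    for (int oy = 0; oy < OUT_H; ++oy) {
        for (int ox = 0; ox < OUT_W; ++ox) {
#pragma HLS PIPELINE II=1
            // Pipelining here makes the HLS tool unroll the small kernel loops below
            // and try to produce one output pixel per clock cycle.
            float acc = 0.0f;
            for (int ky = 0; ky < KH; ++ky)
                for (int kx = 0; kx < KW; ++kx)
                    acc += in[oy + ky][ox + kx] * weight[ky][kx];
            out[oy][ox] = acc;
        }
    }
}
```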

neural-compressor

Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool) aims to provide unified APIs for network compression technologies, such as low precision, sparsity, pruning, and knowledge distillation, across different deep learning frameworks to pursue the best inference performance.

License: Apache-2.0 · Stargazers: 0 · Issues: 0
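
"Low precision" in the description above usually means mapping float tensors to int8. The sketch below shows a per-tensor symmetric quantize/dequantize round trip purely as a concept illustration; it is not Intel® Neural Compressor's API, and all names are assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Per-tensor symmetric int8 quantization: q = round(x / scale), with scale = max|x| / 127.
struct QuantizedTensor {
    std::vector<std::int8_t> data;
    float scale;
};

QuantizedTensor quantize_int8(const std::vector<float>& x) {
    float max_abs = 0.0f;
    for (float v : x) max_abs = std::max(max_abs, std::fabs(v));
    const float scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
    QuantizedTensor q{ {}, scale };
    q.data.reserve(x.size());
    for (float v : x) {
        const float r = std::round(v / scale);
        q.data.push_back(static_cast<std::int8_t>(std::clamp(r, -127.0f, 127.0f)));
    }
    return q;
}

// Dequantize back to float: x ≈ q * scale (lossy; the error is the quantization noise).
std::vector<float> dequantize(const QuantizedTensor& q) {
    std::vector<float> x;
    x.reserve(q.data.size());
    for (std::int8_t v : q.data) x.push_back(v * q.scale);
    return x;
}
```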

openvino_tensorflow

OpenVINO™ integration with TensorFlow

License: NOASSERTION · Stargazers: 0 · Issues: 0

bnna

BNN (binary neural network) accelerator

Stargazers: 0 · Issues: 0
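
In a binarized neural network the multiply-accumulate collapses to XNOR plus popcount over bit-packed ±1 weights and activations, which is what makes BNN accelerators so cheap in hardware. The snippet below is a generic software sketch of that trick, not code from bnna; the packing convention is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Binary dot product of two ±1 vectors packed 64 values per word (bit 1 = +1, bit 0 = -1).
// XNOR marks positions where the operands agree; popcount counts them, and the ±1 dot
// product is (#agreements) - (#disagreements) = 2 * matches - n.
// Assumes n (the logical length) is a multiple of 64, so every packed bit is meaningful.
int binary_dot(const std::vector<std::uint64_t>& a,
               const std::vector<std::uint64_t>& b, int n) {
    int matches = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        matches += __builtin_popcountll(~(a[i] ^ b[i]));   // GCC/Clang builtin
    return 2 * matches - n;
}
```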

FPGA_AcceleratorWrapper

Accelerator wrapper with AXI3 DMA and AXI Lite for control

Stargazers: 0 · Issues: 0

approximate-spmv-topk

Public repository for the DAC 2021 paper "Scaling up HBM Efficiency of Top-K SpMV for Approximate Embedding Similarity on FPGAs"

License: MIT · Stargazers: 0 · Issues: 0
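
The operation in the paper title combines a sparse matrix-vector multiply with selecting the K largest entries of the result (e.g. the K stored embeddings most similar to a query). The sketch below is a plain software version of that pipeline, not the repository's FPGA implementation; the CSR layout and names are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Top-K SpMV: y = A * x with A sparse in CSR form, then return the indices of the
// K largest entries of y (e.g. the K embeddings most similar to the query x).
std::vector<std::size_t> topk_spmv(const std::vector<std::size_t>& row_ptr,
                                   const std::vector<std::size_t>& col_idx,
                                   const std::vector<float>& values,
                                   const std::vector<float>& x, std::size_t k) {
    const std::size_t rows = row_ptr.size() - 1;
    std::vector<float> y(rows, 0.0f);
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t p = row_ptr[i]; p < row_ptr[i + 1]; ++p)
            y[i] += values[p] * x[col_idx[p]];

    // Partial sort of row indices by descending score; only the first k come out ordered.
    std::vector<std::size_t> idx(rows);
    std::iota(idx.begin(), idx.end(), 0);
    k = std::min(k, rows);
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                      [&](std::size_t a, std::size_t b) { return y[a] > y[b]; });
    idx.resize(k);
    return idx;
}
```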

MVU

Neural Network accelerator powered by MVUs and RISC-V.

License: MIT · Stargazers: 0 · Issues: 0

PE-array-for-LeNet-accelerator-based-on-FPGA

This is a 4×5 PE array for a LeNet accelerator based on an FPGA.

Stargazers: 0 · Issues: 0

Yolo-Fastest

:zap: An ultra-lightweight, general-purpose object detection algorithm based on YOLO: the computational cost is only about 250 MFLOPs, the ncnn model size is only 666 KB, it runs at 15+ fps on a Raspberry Pi 3B, and at 178+ fps on mobile devices

License: NOASSERTION · Stargazers: 0 · Issues: 0

XNNPACK

High-efficiency floating-point neural network inference operators for mobile, server, and Web

License: NOASSERTION · Stargazers: 0 · Issues: 0

ara

The PULP Ara is a 64-bit Vector Unit, compatible with the RISC-V Vector Extension Version 0.10, working as a coprocessor to CORE-V's CVA6 core

License: NOASSERTION · Stargazers: 0 · Issues: 0

hci

Heterogeneous Cluster Interconnect to bind special-purpose HW accelerators with general-purpose cluster cores

License: NOASSERTION · Stargazers: 0 · Issues: 0