gitbisector / vitisbertl

experiments for low latency BERT large inference on Alveo

vitisbertl

Experiments for low-latency BERT-large inference on the Alveo U50.

The Vitis HLS C++ module "feeder" is a matrix multiplication kernel that uses 1024 DSPs in parallel.

It computes the product of an (nmat*1024, 1024) matrix and a (1024, vec) matrix, for nmat in [1, 8] and vec in [1, 128].
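To make the shapes concrete, here is a minimal software reference for the same product. It is only a sketch and is not taken from the repository: the name `feeder_reference`, the row-major layout, and the int8 input / int32 accumulator types are assumptions made for illustration, and the loop nest says nothing about how the HLS kernel actually schedules its DSPs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference model (not from the repository): a plain software
// matrix multiply with the shapes the README states for "feeder".
constexpr std::size_t kDim = 1024;  // shared inner dimension, from the README

// a: (nmat*1024, 1024), row-major.  b: (1024, vec), row-major.
// Returns c: (nmat*1024, vec), row-major.
// ASSUMPTION: int8 inputs and int32 accumulation are chosen for illustration;
// the README does not state the data types.
std::vector<std::int32_t> feeder_reference(const std::vector<std::int8_t>& a,
                                           const std::vector<std::int8_t>& b,
                                           std::size_t nmat, std::size_t vec) {
    // Parameter ranges as stated in the README.
    assert(nmat >= 1 && nmat <= 8);
    assert(vec >= 1 && vec <= 128);

    const std::size_t rows = nmat * kDim;
    assert(a.size() == rows * kDim);
    assert(b.size() == kDim * vec);

    std::vector<std::int32_t> c(rows * vec, 0);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t k = 0; k < kDim; ++k) {
            const std::int32_t aik = a[i * kDim + k];
            for (std::size_t j = 0; j < vec; ++j) {
                // One multiply-accumulate per (i, k, j) triple.
                c[i * vec + j] += aik * static_cast<std::int32_t>(b[k * vec + j]);
            }
        }
    }
    return c;
}
```

The result has shape (nmat*1024, vec). A model like this can be useful as a golden reference in a C simulation testbench, where the kernel output is compared element by element against it.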

License: Apache License 2.0


Languages

C++ 83.2%, Makefile 16.1%, C 0.7%