Hkeee's repositories

PCEngine

[MLSys'23] Exploiting Hardware Utilization and Adaptive Dataflow for Efficient Sparse Convolution in 3D Point Clouds.

Language: Cuda · License: MIT · Stargazers: 7 · Issues: 1

Byte-GLM

An efficient implementation of the GLM large language model.

Language: Python · License: MIT · Stargazers: 0 · Issues: 1

ByteEngine

An LLM engine based on ByteTransformer.

Language: C++ · License: Apache-2.0 · Stargazers: 0 · Issues: 0

chatglm-throughput

A plugin to measure the throughput of LLMs such as ChatGLM; a generic measurement sketch follows below.

Language: Python · Stargazers: 0 · Issues: 0
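
A minimal sketch of what such a throughput measurement could look like, assuming a hypothetical `generate_fn` callable that returns the number of tokens produced for each prompt; this is illustrative only, not the plugin's actual API:

```python
import time

def measure_throughput(generate_fn, prompts, max_new_tokens=128):
    """Generic tokens-per-second measurement (illustrative, not the plugin's API).

    `generate_fn(prompt, max_new_tokens)` is a hypothetical callable assumed to
    return the number of tokens actually generated for that prompt.
    """
    start = time.perf_counter()
    total_tokens = sum(generate_fn(p, max_new_tokens) for p in prompts)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed  # aggregate tokens per second
```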

FGMS

An efficient kernel implementation of the fused gather-matmul-scatter operation; an unfused reference sketch follows below.

Language: Cuda · License: MIT · Stargazers: 0 · Issues: 1
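
For reference, a minimal unfused PyTorch sketch of the gather-matmul-scatter pattern, assuming the kernel gathers input feature rows by index, multiplies them by a dense weight matrix, and scatter-adds the results into output rows; tensor and function names here are illustrative, not the repository's API:

```python
import torch

def gather_matmul_scatter_reference(features, weight, in_idx, out_idx, num_out):
    """Unfused reference for the pattern a fused kernel would replace.

    features: (N, C_in) input feature rows
    weight:   (C_in, C_out) dense weight matrix
    in_idx:   (M,) rows to gather from `features`
    out_idx:  (M,) destination rows in the output
    """
    gathered = features[in_idx]              # gather
    transformed = gathered @ weight          # matmul
    out = torch.zeros(num_out, weight.shape[1],
                      device=features.device, dtype=features.dtype)
    out.index_add_(0, out_idx, transformed)  # scatter-add
    return out
```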

FuseSage

A multi-stream acceleration strategy for GraphSAGE; an illustrative sketch follows below.

Language: Cuda · Stargazers: 0 · Issues: 0
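
The repository itself is CUDA; as an illustration of the general multi-stream idea only (not FuseSage's code), a PyTorch sketch that overlaps two independent aggregation-style workloads on separate CUDA streams, using a stand-in matmul rather than actual GraphSAGE kernels:

```python
import torch

streams = [torch.cuda.Stream() for _ in range(2)]
feats = [torch.rand(100_000, 256, device="cuda") for _ in range(2)]
weights = [torch.rand(256, 256, device="cuda") for _ in range(2)]
outs = [None, None]

# Launch the two independent batches on separate streams so their kernels
# can overlap instead of serializing on the default stream.
for i, s in enumerate(streams):
    with torch.cuda.stream(s):
        outs[i] = feats[i] @ weights[i]

torch.cuda.synchronize()  # wait for both streams to finish
```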

gnn-acceleration-framework-with-FPGA

Includes a compiler that encodes DGL GNN models into instructions, runtime software that transfers data and controls the accelerator, and hardware Verilog code that can be implemented on an FPGA.

Language: SystemVerilog · License: Apache-2.0 · Stargazers: 0 · Issues: 0

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

sparse-op-test

Test files to measure the performance of SpMM, SDDMM, and SpGEMM on GPUs; a minimal SpMM timing sketch follows below.

Language: Cuda · Stargazers: 0 · Issues: 1
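
A minimal sketch of how an SpMM timing on GPU could look in PyTorch, using CUDA events and a synthetic ~1%-dense matrix; this is illustrative and not the repository's own test scripts:

```python
import torch

M, K, N = 4096, 4096, 512
A = torch.rand(M, K, device="cuda")
A = A * (A > 0.99)               # keep roughly 1% of entries
A_sparse = A.to_sparse()         # COO sparse matrix
B = torch.rand(K, N, device="cuda")

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

for _ in range(10):              # warm-up
    torch.sparse.mm(A_sparse, B)
torch.cuda.synchronize()

start.record()
for _ in range(100):
    C = torch.sparse.mm(A_sparse, B)
end.record()
torch.cuda.synchronize()
print(f"SpMM mean latency: {start.elapsed_time(end) / 100:.3f} ms")
```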

vllm_qwen

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

xformers-hacked

Hackable and optimized Transformer building blocks, supporting composable construction.

Language: Python · License: NOASSERTION · Stargazers: 0 · Issues: 0