Hkeee's repositories
ByteEngine
An LLM engine based on ByteTransformer.
chatglm-throughput
A plugin for measuring the throughput of LLMs such as ChatGLM.
FuseSage
A multi-stream acceleration strategy for GraphSAGE.
gnn-acceleration-framework-with-FPGA
A GNN acceleration framework for FPGAs, including a compiler that encodes DGL GNN models into instructions, runtime software that transfers data and controls the accelerator, and Verilog hardware code that can be implemented on an FPGA.
lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
sparse-op-test
Test files for measuring the performance of SpMM, SDDMM, and SpGEMM on GPU.
vllm_qwen
A high-throughput and memory-efficient inference and serving engine for LLMs.
xformers-hacked
Hackable and optimized Transformers building blocks, supporting a composable construction.