huafeng (gitover22)


User data from GitHub: https://github.com/gitover22

Company: Pursuing an M.S. at UCAS.

Location: China

GitHub: @gitover22

huafeng's repositories

CaptchaNetTrainer

A framework for learning deep-learning training and testing procedures.

Language: Python | Stargazers: 9 | Issues: 0

the-Congestion-Control-Process-in-TCP-in-NS-3

Programming Assignment for CN2023: Understanding the Congestion Control Process in TCP in NS-3.

Language: C++ | License: GPL-3.0 | Stargazers: 8 | Issues: 0

DSF_CNCL

A DSM system framework based on CNCL.

Language: C++ | License: MIT | Stargazers: 7 | Issues: 0

HF_DFS

HF_DFS is a distributed file system built on FastDFS.

Language: C | License: Apache-2.0 | Stargazers: 7 | Issues: 2

SuperServer

A high-performance server written in C/C++.

Language: C++ | License: MIT | Stargazers: 7 | Issues: 0

Go_notes

Notes documenting my Go learning journey.

Language: Python | License: Apache-2.0 | Stargazers: 6 | Issues: 1

Linux_syscall_demo

Test demos for Linux syscalls.

Language: C++ | Stargazers: 6 | Issues: 1

miniCache

A simple cache implementation with write-back and write-allocate policies.

Language: Verilog | License: GPL-3.0 | Stargazers: 6 | Issues: 0

MiniGPT4-on-MLU

A port of MiniGPT4 to the Cambricon MLU370, supporting multi-card training and inference.

Language: Python | License: BSD-3-Clause | Stargazers: 5 | Issues: 1

cuda-samples

Samples for CUDA developers demonstrating features in the CUDA Toolkit.

Language: C | License: NOASSERTION | Stargazers: 3 | Issues: 0

LLaMA-infer

An inference framework for LLaMA models.

Language: C | License: GPL-3.0 | Stargazers: 3 | Issues: 0

llama-study

Inference code for Llama models

Language: Python | License: NOASSERTION | Stargazers: 3 | Issues: 0

CUDA-Learn-Notes

🎉 CUDA learning notes with PyTorch: fp32, fp16/bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

License: GPL-3.0 | Stargazers: 2 | Issues: 0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

License: Apache-2.0 | Stargazers: 2 | Issues: 0

glake

GLake: optimizing GPU memory management and I/O transmission.

License: Apache-2.0 | Stargazers: 2 | Issues: 0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

License: Apache-2.0 | Stargazers: 2 | Issues: 0

nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates a high-performance executable from a DNN model description.

License: MIT | Stargazers: 2 | Issues: 0

sglang

SGLang is a fast serving framework for large language models and vision language models.

License: Apache-2.0 | Stargazers: 2 | Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

License: Apache-2.0 | Stargazers: 2 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

License: Apache-2.0 | Stargazers: 2 | Issues: 0

cambricon-pytorch

Build Cambricon PyTorch from source.

Stargazers: 1 | Issues: 0