Zhiwei35

Geek Repo

Followers: 0

Following: 0

Company: Intel

Zhiwei35's repositories

Language: Cuda | Stargazers: 1 | Issues: 0

PLCT-Open-Reports

Slides and reports from the PLCT Lab on RISC-V and MLIR

License: CC-BY-SA-4.0 | Stargazers: 1 | Issues: 0

Awesome-GPU

Awesome resources for GPUs

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

code-samples

Source code examples from the Parallel Forall Blog

Language: HTML | License: BSD-3-Clause | Stargazers: 0 | Issues: 0

Cpp_houjie

Slides (PPT) and code from Hou Jie's C++ courses

Language: C++ | Stargazers: 0 | Issues: 0

CPP_Optimizations_Diary

Tips and tricks to optimize your C++ code

Language: C++ | Stargazers: 0 | Issues: 0

cutlass

CUDA Templates for Linear Algebra Subroutines

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 0

DeepLearningSystem

Deep Learning System core principles introduction.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0

flash_attention_inference

compressed version of flash attn to flash decoding

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0
Language: Cuda | Stargazers: 0 | Issues: 0

HPCInfo

Information about many aspects of high-performance computing. Wiki content moved to ~/docs.

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0

IOS

[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

modern-cpp-tutorial

📚 Modern C++ Tutorial: C++11/14/17/20 On the Fly | https://changkun.de/modern-cpp/

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0

MyTinySTL

STL class impl in C++11

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 0

llama.cpp

Pure C/C++ LLaMA

License: MIT | Stargazers: 0 | Issues: 0
Language: Cuda | Stargazers: 0 | Issues: 0
License: Apache-2.0 | Stargazers: 0 | Issues: 0

nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.

License: MIT | Stargazers: 0 | Issues: 0

Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F

Stepwise optimizations of DGEMM on CPU, reaching performance faster than Intel MKL eventually, even under multithreading.

License: GPL-3.0 | Stargazers: 0 | Issues: 0

OptimizingSeriesTranslation

Chinese version for Agner Fog's optimizing series

Stargazers: 0 | Issues: 0

PaddleCustomDevice

PaddlePaddle custom device implementation (custom hardware integration for PaddlePaddle).

License: Apache-2.0 | Stargazers: 0 | Issues: 0

train-LeNet5-by-cuda

Train a LeNet5 with CUDA

Stargazers: 0 | Issues: 0