wplf

李金梁's repositories

MIT_6.5940

MIT open course, efficient ML

Language: Jupyter Notebook

CMU-10-714

CMU 10-714 Deep-Learning-Systems

Language: Jupyter Notebook

Compass_Optimizer

Compass Optimizer (OPT for short) is part of the Zhouyi Compass Neural Network Compiler. OPT converts the float Intermediate Representation (IR) generated by the Compass Unified Parser into an optimized quantized or mixed IR suited to Zhouyi NPU hardware platforms.

Language: Python · License: Apache-2.0

Compass_Unified_Parser

armchina NPU parser

Language: Python · License: Apache-2.0

Competitive_Programming

WPLF template

cs-self-learning

A guide to self-studying computer science

Language: HTML · License: MIT

how-to-optimize-gemm

row-major matmul optimization

Language: C++ · License: GPL-3.0

How_to_optimize_in_GPU

A series on GPU optimization that explains in detail how to optimize CUDA kernels. It covers several basic kernels, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.

Language: Cuda · License: Apache-2.0

lzzplus2x

lzzkmc_wplf_changed

Language: C++

MIT-6.031-Software-Construction

A record of studying 6.031

Language: Java

OI-wiki

🌟 Wiki of OI / ICPC for everyone. (An online strategy guide for a certain large-scale game, featuring cool arithmetic magic)

Language: TypeScript

onnx

Open standard for machine learning interoperability

Language: Python · License: Apache-2.0

UCB-CS61c-2020summer

Language: C

Megatron-LM

Ongoing research training transformer models at scale

Language: Python · License: NOASSERTION

mit-65840

tinyflow

Tutorial code on how to build your own Deep Learning System in 2k Lines

Language: C++ · License: Apache-2.0

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

License: Apache-2.0

wplf

This is a special repository for my GitHub profile.

License: Apache-2.0

wplf.github.io

Language: HTML · License: MIT

MIT_6.5940

my-CS-road

py_tutorial

UCB-CS161-sp24

CMU-10-714

Compass_Optimizer

Compass_Unified_Parser

Competitive_Programming

cs-self-learning

how-to-optimize-gemm

How_to_optimize_in_GPU

lzzplus2x

MIT-6.031-Software-Construction

OI-wiki

onnx

UCB-CS61c-2020summer

Megatron-LM

mit-65840

tinyflow

TransformerEngine

wplf

wplf.github.io