umiswing

User data from GitHub: https://github.com/umiswing

Company: NEU

Location: China

Home Page: https://umiswing.github.io/

GitHub: @umiswing


Organizations
AyakaGEMM
PaddlePaddle

umiswing's repositories

NiuTrans.NMT

A Fast Neural Machine Translation System. It is developed in C++ and resorts to NiuTensor for fast tensor APIs.

Language: C++ | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0
Language: Emacs Lisp | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Cuda | Stargazers: 0 | Issues: 1 | Issues: 0
Stargazers: 0 | Issues: 1 | Issues: 0
Language: Cuda | Stargazers: 0 | Issues: 0 | Issues: 0

DocumentSASS

Unofficial description of the CUDA assembly (SASS) instruction sets.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 1 | Issues: 0
Language: C | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

emacs-abyss-theme

A dark theme for Emacs

Language: Emacs Lisp | License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

emacs-catppuccin

🍄 Soothing pastel theme for Emacs

Language: Emacs Lisp | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Paddle

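PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the core framework of PaddlePaddle 『飞桨』: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)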

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: C++ | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

flux

A fast communication-overlapping library for tensor parallelism on GPUs.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: C | Stargazers: 0 | Issues: 0 | Issues: 0

How_to_optimize_in_GPU

A series of GPU optimization topics that explains in detail how to optimize programs on the GPU. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.

Language: Cuda | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

maxas

Assembler for NVIDIA Maxwell architecture

Language: Sass | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: C++ | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0

PaddleFlashattnTest

Additional tests of the flash attention API in Paddle

Stargazers: 0 | Issues: 0 | Issues: 0
Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: C | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 1
Language: HTML | Stargazers: 0 | Issues: 1 | Issues: 0

YHs_Sample

Yinghan's Code Sample

Language: Cuda | License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0