Mengchi Zhang (brad-mengchi)

Company: Meta

Location: Menlo Park

Home Page: https://sites.google.com/site/mengchizhang/

Mengchi Zhang's starred repositories

free-programming-books-zh_CN

:books: Free Chinese-language computer programming books; contributions welcome

License: GPL-3.0, Stargazers: 110974, Issues: 5890, Issues: 0

paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

License: Apache-2.0, Stargazers: 25971, Issues: 718, Issues: 0

triton

Development repository for the Triton language and compiler

Language: C++, License: GPL-2.0, Stargazers: 9118, Issues: 360, Issues: 0

attention-is-all-you-need-pytorch

A PyTorch implementation of the Transformer model in "Attention is All You Need".

Language: Python, License: MIT, Stargazers: 8703, Issues: 95, Issues: 181

cuda-samples

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

Language: C, License: NOASSERTION, Stargazers: 6004, Issues: 117, Issues: 234

AITemplate

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Language: Python, License: Apache-2.0, Stargazers: 4520, Issues: 82, Issues: 242

asmjit

Low-latency machine code generation

Language: C++, License: Zlib, Stargazers: 3914, Issues: 153, Issues: 323

HIP

HIP: C++ Heterogeneous-Compute Interface for Portability

AI-Chip

A list of ICs and IPs for AI, Machine Learning and Deep Learning.

gpu

Dissecting the M1's GPU for 3D acceleration

Language: C, Stargazers: 982, Issues: 81, Issues: 0

x86-assembly-cheat

MOVED TO: https://************.com/linux-kernel-module-cheat/userland-assembly with code at https://github.com/************/linux-kernel-module-cheat/tree/master/userland/arch/x86_64 (see the README). Minimal x86 IA-32 and x86-64 userland examples tutorial. Hundreds of runnable asserts. Nice GDB setup. I/O is done with libc, so it is OS-portable in theory. NASM and GAS covered. Tested on Ubuntu 18.04. Containers (ELF), linking, calling conventions. System land cheat at: https://github.com/************/x86-bare-metal-examples, ARM cheat at: https://github.com/************/arm-assembly-cheat

DeathStarBench

Open-source benchmark suite for cloud microservices

Language: Lua, License: Apache-2.0, Stargazers: 718, Issues: 22, Issues: 166

llvm-pass-skeleton

Example LLVM pass

Language: C++, License: MIT, Stargazers: 548, Issues: 16, Issues: 25

autofdo

AutoFDO

Language: C++, License: Apache-2.0, Stargazers: 512, Issues: 30, Issues: 107

hotcrp

HotCRP conference review software

Language: PHP, License: NOASSERTION, Stargazers: 325, Issues: 22, Issues: 307

accel-sim-framework

This is the top-level repository for the Accel-Sim framework.

Language: Python, License: NOASSERTION, Stargazers: 284, Issues: 10, Issues: 183

libcxx

libc++; cloned from http://llvm.org/git/libcxx.git

Language: C++, License: NOASSERTION, Stargazers: 185, Issues: 19, Issues: 1

machine_learning

A collection of packages for ML projects, written with TensorFlow's Python API

cub

THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.

Language: Cuda, License: BSD-3-Clause, Stargazers: 79, Issues: 5, Issues: 0

MicroSuite

µSuite: A Benchmark Suite for Microservices

Language: C++, License: BSD-3-Clause, Stargazers: 41, Issues: 4, Issues: 6

source-glibc

Notes about glibc, ld.so, and more.

Language: C, Stargazers: 34, Issues: 3, Issues: 0

GaloisGPU

LonestarGPU: Irregular algorithms parallelized for GPUs

Language: C++, License: NOASSERTION, Stargazers: 33, Issues: 15, Issues: 1

heterosync

HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs

Language: Cuda, License: NOASSERTION, Stargazers: 26, Issues: 3, Issues: 1

hotcrp-docker-compose

An easy docker-compose deployment of hotcrp

amd-llvm-project

DO NOT USE: Deprecated: Mirror of AMD llvm-project: The source repo is https://github.com/RadeonOpenCompute/llvm-project. Several times a day the default branch "amd-stg-open" is updated from the source repo and then locked. The purpose of this repo is to share and test fixes that have not yet gone into a review process.

MightyPC

Mighty toolkit for conference Program Chairs.

Language: Python, License: MIT, Stargazers: 7, Issues: 4, Issues: 1

ISCA-2021-Script

A collection of redistributable Python scripts to help organize ISCA 2021 (The 48th International Symposium on Computer Architecture).

Language: Python, License: GPL-2.0, Stargazers: 5, Issues: 1, Issues: 0

gpu_unified_cache

GPU-UniCache: Automatic Code Generation of Spatial Blocking for Stencils on GPUs

Language: Cuda, Stargazers: 4, Issues: 1, Issues: 0