wong_hs (takeshineshiro)

takeshineshiro

Geek Repo

Location: 朱辛庄

wong_hs's repositories

riscv-iommu

IOMMU IP compliant with the RISC-V IOMMU Specification v1.0

License: Apache-2.0 | Stargazers: 0 | Issues: 0

mgpusim

A highly-flexible GPU simulator for AMD GPUs.

License: MIT | Stargazers: 0 | Issues: 0

mlc-llm

Universal LLM Deployment Engine with ML Compilation

License: Apache-2.0 | Stargazers: 0 | Issues: 0

vulkan-sim

Vulkan-Sim is a GPU architecture simulator for Vulkan ray tracing based on GPGPU-Sim and Mesa.

License: NOASSERTION | Stargazers: 0 | Issues: 0

ramulator2

Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM standards, emerging RowHammer mitigation techniques). Described in our paper https://people.inf.ethz.ch/omutlu/pub/Ramulator2_arxiv23.pdf

License: MIT | Stargazers: 0 | Issues: 0

tpu-mlir

Machine learning compiler based on MLIR for Sophgo TPU.

License: NOASSERTION | Stargazers: 0 | Issues: 0

ROCm

AMD ROCm™ Software - GitHub Home

License: MIT | Stargazers: 0 | Issues: 0

brpc

brpc is an industrial-grade RPC framework written in C++, often used in high-performance systems such as search, storage, machine learning, advertisement, and recommendation. "brpc" means "better RPC".

License: Apache-2.0 | Stargazers: 0 | Issues: 0

iob-cache

Verilog Configurable Cache

License: MIT | Stargazers: 0 | Issues: 0

FasterTransformer

Transformer-related optimization, including BERT and GPT

License: Apache-2.0 | Stargazers: 0 | Issues: 0

esp

Embedded Scalable Platforms: Heterogeneous SoC architecture and IP integration made easy

License: NOASSERTION | Stargazers: 0 | Issues: 0

MegCC

MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

triton

Development repository for the Triton language and compiler

License: MIT | Stargazers: 0 | Issues: 0

iDMA

A modular, parametrizable, and highly flexible Data Movement Accelerator (DMA)

License: NOASSERTION | Stargazers: 0 | Issues: 0

opentitan

OpenTitan: Open source silicon root of trust

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Coyote

Framework providing operating system abstractions and a range of shared networking (RDMA, TCP/IP) and memory services to common modern heterogeneous platforms.

License: MIT | Stargazers: 0 | Issues: 0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

gloo

Collective communications library with various primitives for multi-machine training.

License: NOASSERTION | Stargazers: 0 | Issues: 0

start-ai-compiler

Start AI Compiler

License: MIT | Stargazers: 0 | Issues: 0

lbt

Develop a toolchain based on LLVM for the Cpu0 processor

Stargazers: 0 | Issues: 0

openmlsys-zh

Machine Learning Systems: Design and Implementation (Chinese version)

Stargazers: 0 | Issues: 0

ML-Accelerators

Topics in Machine Learning Accelerator Design

License: Apache-2.0 | Stargazers: 0 | Issues: 0

DeepLearningSystem

An introduction to the core principles of deep learning systems.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

NOCulator

NOCulator is a network-on-chip simulator providing cycle-accurate performance models for a wide variety of networks (mesh, torus, ring, hierarchical ring, flattened butterfly) and routers (buffered, bufferless, Adaptive Flow Control, minBD, HiRD).

License: MIT | Stargazers: 0 | Issues: 0

Ripes

A graphical processor simulator and assembly editor for the RISC-V ISA

License: MIT | Stargazers: 0 | Issues: 0

CUDA-Programming-Guide-in-Chinese

This is a Chinese translation of the CUDA programming guide

Stargazers: 0 | Issues: 0

book

Backup of some books

Stargazers: 0 | Issues: 0

AI-Chip

A list of ICs and IPs for AI, Machine Learning and Deep Learning.

Stargazers: 0 | Issues: 0