tengdecheng

Location: China

Home Page: https://tengdecheng-linux.github.io

tengdecheng's starred repositories

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | License: MIT | Stargazers: 16868 | Issues: 179 | Issues: 70

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 10759 | Issues: 104 | Issues: 780

cs231n.github.io

Public facing notes page

Language: Jupyter Notebook | License: MIT | Stargazers: 9918 | Issues: 535 | Issues: 111

Theano

Theano was a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It is being continued as PyTensor: www.github.com/pymc-devs/pytensor

Language: Python | License: NOASSERTION | Stargazers: 9852 | Issues: 541 | Issues: 2680

bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

Language: Python | License: MIT | Stargazers: 5379 | Issues: 46 | Issues: 920

openmlsys-zh

Machine Learning Systems: Design and Implementation (Chinese version)

HIP

HIP: C++ Heterogeneous-Compute Interface for Portability

riscv-isa-sim

Spike, a RISC-V ISA Simulator

Language: C | License: NOASSERTION | Stargazers: 2187 | Issues: 132 | Issues: 765

SparseConvNet

Submanifold sparse convolutional networks

Language: C++ | License: NOASSERTION | Stargazers: 1992 | Issues: 44 | Issues: 222

TensorComprehensions

A domain-specific language to express machine learning workloads.

Language: C++ | License: Apache-2.0 | Stargazers: 1754 | Issues: 108 | Issues: 176

spconv

Spatial Sparse Convolution Library

Language: Python | License: Apache-2.0 | Stargazers: 1714 | Issues: 23 | Issues: 667

huggingface_hub

The official Python client for the Hugging Face Hub.

Language: Python | License: Apache-2.0 | Stargazers: 1669 | Issues: 59 | Issues: 790

hw

RTL, Cmodel, and testbench for NVDLA

Language: Verilog | License: NOASSERTION | Stargazers: 1616 | Issues: 168 | Issues: 345

torch-mlir

The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.

Language: C++ | License: NOASSERTION | Stargazers: 1173 | Issues: 247 | Issues: 606

Language: C++ | License: Apache-2.0 | Stargazers: 997 | Issues: 84 | Issues: 1988

nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates a high-performance executable from a DNN model description.

Language: C++ | License: MIT | Stargazers: 919 | Issues: 44 | Issues: 204

Triton-Puzzles

Puzzles for learning Triton

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 626 | Issues: 3 | Issues: 6

mlir-tutorial

MLIR For Beginners tutorial

pytensor

PyTensor allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

Language: Python | License: NOASSERTION | Stargazers: 244 | Issues: 12 | Issues: 300

NVTX

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

Language: C | License: Apache-2.0 | Stargazers: 237 | Issues: 9 | Issues: 27

llvm-project

PLCT Lab's implementation of the RISC-V V spec, based on llvm/llvm-project, rkruppe/rvv-llvm, and https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi-0.8

HIP-CPU

An implementation of HIP that works on CPUs, across OSes.

Language: C++ | License: MIT | Stargazers: 104 | Issues: 18 | Issues: 37

tvm_gpu_gemm

Playing with GEMM using TVM

SHARK-Turbine

Unified compiler/runtime for interfacing with PyTorch Dynamo.

Language: Python | License: Apache-2.0 | Stargazers: 71 | Issues: 28 | Issues: 382

conv3x3_m1

A demo of how to write a high-performance convolution that runs on Apple silicon

Language: C++ | License: GPL-3.0 | Stargazers: 50 | Issues: 4 | Issues: 0

MEC

ICML 2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)

Language: C++ | License: MIT | Stargazers: 18 | Issues: 1 | Issues: 1

rvv-benchmark

Benchmarks and test cases accompanying PLCT Lab's rvv-llvm implementation

Language: Assembly | License: CC-BY-4.0 | Stargazers: 18 | Issues: 0 | Issues: 0

countdownlatchcpp

CountDownLatch in C++

Language: C++ | License: MIT | Stargazers: 11 | Issues: 3 | Issues: 0

Language: C++ | License: MIT | Stargazers: 10 | Issues: 0 | Issues: 0

fixed_point_math

A templated, header-only fixed-point math library for C++

Language: C++ | Stargazers: 6 | Issues: 0 | Issues: 0