yawen_Li (a243845305)

Location: Shenzhen

yawen_Li's starred repositories

YHs_Sample

Yinghan's Code Sample

Language: Cuda | License: GPL-3.0 | Stargazers: 243 | Issues: 0
Language: Cuda | Stargazers: 44 | Issues: 0

llama.cpp

LLM inference in C/C++

Language: C++ | License: MIT | Stargazers: 59935 | Issues: 0
Language: Cuda | Stargazers: 18 | Issues: 0

NVIDIA-OpenCL-Samples

Compilable official NVIDIA OpenCL sample code, https://developer.nvidia.com/opencl

Language: C | Stargazers: 2 | Issues: 0

iphone_dcim_backup

Back up iPhone photos

Language: Python | Stargazers: 5 | Issues: 0

ppq

PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.

Language: Python | License: Apache-2.0 | Stargazers: 1415 | Issues: 0
Language: C | Stargazers: 91 | Issues: 0

mperf

mperf is an operator performance tuning toolbox for mobile and embedded platforms

Language: C++ | License: Apache-2.0 | Stargazers: 166 | Issues: 0
Language: C++ | License: Apache-2.0 | Stargazers: 242 | Issues: 0

How_to_optimize_in_GPU

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.

Language: Cuda | License: Apache-2.0 | Stargazers: 720 | Issues: 0

workflow

C++ Parallel Computing and Asynchronous Networking Framework

Language: C++ | License: Apache-2.0 | Stargazers: 12584 | Issues: 0

QualcommOpenCLSDKNote

Notes on the Qualcomm OpenCL SDK

Language: C++ | Stargazers: 22 | Issues: 0

sgemm_hsw

This is an implementation of sgemm_kernel on L1d cache.

Language: Assembly | License: GPL-3.0 | Stargazers: 214 | Issues: 0

cpu-cache-test

CPU cache latency experiments

Language: C | Stargazers: 1 | Issues: 0

OpenCL-correlation-using-local-memory

Correlation demo in OpenCL that uses local memory.

Language: C | Stargazers: 1 | Issues: 0

memtestCL

OpenCL memory tester for GPUs

Language: C++ | License: NOASSERTION | Stargazers: 118 | Issues: 0

libpag

The official rendering library for PAG (Portable Animated Graphics) files that renders After Effects animations natively across multiple platforms.

Language: C++ | License: NOASSERTION | Stargazers: 4790 | Issues: 0

shoc

The SHOC Benchmark Suite

Language: Makefile | License: NOASSERTION | Stargazers: 238 | Issues: 0
Stargazers: 44 | Issues: 0

ppl.nn

A primitive library for neural network

Language: C++ | License: Apache-2.0 | Stargazers: 1236 | Issues: 0
Language: C++ | Stargazers: 80 | Issues: 0

CUDA_gemm

A simple high performance CUDA GEMM implementation.

Language: Cuda | Stargazers: 292 | Issues: 0

Cplusplus-Concurrency-In-Practice

A Detailed Cplusplus Concurrency Tutorial (C++ Concurrent Programming Guide)

Language: C++ | License: MIT | Stargazers: 5204 | Issues: 0

mmcv

OpenMMLab Computer Vision Foundation

Language: Python | License: Apache-2.0 | Stargazers: 5694 | Issues: 0

mmpretrain

OpenMMLab Pre-training Toolbox and Benchmark

Language: Python | License: Apache-2.0 | Stargazers: 3241 | Issues: 0

mobilenet-ssd-snpe

MobileNet-SSD SNPE demo

Language: C++ | Stargazers: 39 | Issues: 0
Language: Python | License: MIT | Stargazers: 716 | Issues: 0

TNN

TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop, and server. TNN is distinguished by several outstanding features, including its cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and also draws on the good extensibility and high performance of existing open source efforts. TNN has been deployed in multiple Tencent apps, such as Mobile QQ, Weishi, and Pitu. Contributions are welcome; collaborate with us to make TNN a better framework.

Language: C++ | License: NOASSERTION | Stargazers: 4305 | Issues: 0

CPlusPlusThings

All Things C++

Language: C++ | Stargazers: 37787 | Issues: 0