Yujia Zhai (yzhaiustc)

Company: @NVIDIA

Location: Santa Clara, California

Home Page: https://yzhaiustc.github.io/

Yujia Zhai's repositories

Optimizing-SGEMM-on-NVIDIA-Turing-GPUs

Optimizing SGEMM kernel functions on NVIDIA GPUs to close-to-cuBLAS performance.

Language: Cuda | License: GPL-3.0 | Stargazers: 257 | Issues: 6 | Issues: 7

Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F

Stepwise optimizations of DGEMM on CPU, eventually outperforming Intel MKL, even under multithreading.

Language: C | License: GPL-3.0 | Stargazers: 96 | Issues: 4 | Issues: 1

Optimizing-SGEMV-on-NVIDIA-GPUs

An implementation of SGEMV with performance comparable to cuBLAS.

Language: Cuda | License: GPL-3.0 | Stargazers: 7 | Issues: 3 | Issues: 1

ftblas

A high-performance BLAS implementation with online fault tolerance.

Language: C++ | License: GPL-3.0 | Stargazers: 5 | Issues: 4 | Issues: 0

Optimizing-DGEMV-on-Intel-CPUs

Highly optimized DGEMV on CPU with both serial and parallel performance better than MKL and OpenBLAS.

Language: C | License: GPL-3.0 | Stargazers: 3 | Issues: 3 | Issues: 0

FasterTransformer

Transformer-related optimization, including BERT and GPT.

Language: C++ | License: Apache-2.0 | Stargazers: 1 | Issues: 2 | Issues: 0

TurboTransformers

A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.

Language: C++ | License: NOASSERTION | Stargazers: 1 | Issues: 2 | Issues: 0

alshedivat-al-folio

A beautiful, simple, clean, and responsive Jekyll theme for academics

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

antares

Antares: an automatic engine for multi-platform kernel generation and optimization. Supporting CPU, CUDA, ROCm, DirectX12, GraphCore, SYCL for CPU/GPU, OpenCL for AMD/NVIDIA, Android CPU/GPU backends.

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

blis

BLAS-like Library Instantiation Software Framework

Language: C | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0
Language: C++ | Stargazers: 0 | Issues: 2 | Issues: 0

cutlass

CUDA Templates for Linear Algebra Subroutines

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

effective_transformer

Running BERT without Padding

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

esecurity

MSc Module

Language: JavaScript | Stargazers: 0 | Issues: 2 | Issues: 0

HElib

HElib is an open-source software library that implements homomorphic encryption. It supports the BGV scheme with bootstrapping and the Approximate Number CKKS scheme. HElib also includes optimizations for efficient homomorphic evaluation, focusing on effective use of ciphertext packing techniques and on the Gentry-Halevi-Smart optimizations.

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

hexl

Intel® Homomorphic Encryption Acceleration Library accelerates modular arithmetic operations used in homomorphic encryption

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0
Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

libfacedetection

An open-source library for face detection in images. The face detection speed can reach 1000 FPS.

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0
Language: Logos | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

ML2021-Spring

**Official** Hung-yi Lee (李宏毅) Machine Learning 2021 Spring

Language: Jupyter Notebook | Stargazers: 0 | Issues: 2 | Issues: 0

oneDNN

oneAPI Deep Neural Network Library (oneDNN)

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

PP-CNN

Privacy Preserving Convolutional Neural Network using Homomorphic Encryption for secure inference

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

rocSOLVER

Next generation LAPACK implementation for ROCm platform

Language: C++ | License: BSD-2-Clause | Stargazers: 0 | Issues: 2 | Issues: 0

SEAL

Microsoft SEAL is an easy-to-use and powerful homomorphic encryption library.

Language: C++ | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

sycl-blas

An implementation of BLAS using the SYCL open standard for acceleration on OpenCL devices

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

sycltrain

Training examples for SYCL

Language: C++ | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

tfhe

TFHE: Fast Fully Homomorphic Encryption Library over the Torus

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0
Language: C++ | Stargazers: 0 | Issues: 1 | Issues: 0

xbyak

a JIT assembler for x86(IA-32)/x64(AMD64, x86-64) MMX/SSE/SSE2/SSE3/SSSE3/SSE4/FPU/AVX/AVX2/AVX-512 by C++ header

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

yzhaiustc.github.io

My personal website

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0