KaneHui

User data from GitHub: https://github.com/KaneHui

@lepton.ai

Hangzhou, Zhejiang Province, China

Kane's repositories

AArch64-Explore

Language: Mathematica

applegpu

Apple G13 GPU architecture docs and tools

Language: HTML, License: BSD-3-Clause

ArchProbe

A profiler to disclose and quantify hardware features on GPUs.

Language: C++, License: MIT

asitop

Perf monitoring CLI tool for Apple Silicon

Language: Python, License: MIT

AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Language: Python, License: NOASSERTION

BasicCUDA

Language: Cuda

cformers

SoTA Transformers with C-backend for fast inference on your CPU.

Language: C, License: MIT

gpt4-pdf-chatbot-langchain

GPT4 & LangChain Chatbot for large PDF docs

Language: TypeScript

HelloSilicon

An introduction to ARM64 assembly on Apple Silicon Macs

Language: Assembly, License: MIT

incubator-tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Language: Python, License: Apache-2.0

insn_bench_aarch64

Instruction latency & throughput profiler for AArch64

Language: C++

llama.cpp

Port of Facebook's LLaMA model in C/C++

Language: C, License: MIT

llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.

LLVM_for_cpu0

A tutorial for learning LLVM. It implements a backend that compiles to machine code for cpu0, a simple RISC CPU.

Language: C++

ml-compiler-opt

Infrastructure for Machine Learning Guided Optimization (MLGO) in LLVM.

Language: Python, License: Apache-2.0

MMdnn

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX and CoreML.

Language: Python, License: MIT

GPTQ-for-LLaMa

4-bit quantization of LLaMA using GPTQ

Language: Python, License: Apache-2.0

gpu-benches

Collection of benchmarks to measure basic GPU capabilities

Language: Jupyter Notebook, License: GPL-3.0

langchain

⚡ Building applications with LLMs through composability ⚡

License: MIT

llm-viz

3D Visualization of a GPT-style LLM

Language: TypeScript

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook, License: Apache-2.0

MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

MOSS

An open-source tool-augmented conversational language model from Fudan University

License: Apache-2.0

netron

Visualizer for neural network, deep learning and machine learning models

Language: JavaScript, License: MIT

NVIDIA_SGEMM_PRACTICE

Step-by-step optimization of CUDA SGEMM

Language: Cuda

SGEMM_CUDA

Fast CUDA matrix multiplication from scratch

Language: Cuda

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python, License: Apache-2.0

stf

Control and manage Android devices from your browser.

License: NOASSERTION

XiangShan

Open-source high-performance RISC-V processor

Language: Scala, License: NOASSERTION

XiangShan-doc

Documentation for XiangShan

Language: TeX, License: CC-BY-4.0