Henson (zccyman)

Company: wondertek

Location: Shanghai, China

Henson's starred repositories

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | Stargazers: 38111 | Issues: 379 | Issues: 1580

interview

📚 A summary of basic C/C++ technical interview knowledge for job seekers and beginners, covering the language, libraries, data structures, algorithms, systems, networking, and linking and loading, plus interview experience, recruitment, and referral information.

Language: C++ | License: NOASSERTION | Stargazers: 33417 | Issues: 869 | Issues: 62

OpenDevin

🐚 OpenDevin: Code Less, Make More

Language: Python | License: MIT | Stargazers: 26755 | Issues: 282 | Issues: 964

llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

mojo

The Mojo Programming Language

Language: Mojo | License: NOASSERTION | Stargazers: 21789 | Issues: 262 | Issues: 1784

models

A collection of pre-trained, state-of-the-art models in the ONNX format

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7325 | Issues: 184 | Issues: 387

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python | License: MIT | Stargazers: 3935 | Issues: 34 | Issues: 421

google-10000-english

This repo contains a list of the 10,000 most common English words in order of frequency, as determined by n-gram frequency analysis of Google's Trillion Word Corpus.

iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

Language: C++ | License: Apache-2.0 | Stargazers: 2377 | Issues: 84 | Issues: 3522

TechCPP

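[C++ Interview & C++ Study Guide] A collection of the essential knowledge points that C++ backend development engineers need for interviews and day-to-day work.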

neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Language: Python | License: Apache-2.0 | Stargazers: 2018 | Issues: 34 | Issues: 186

intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel platforms ⚡

Language: Python | License: Apache-2.0 | Stargazers: 1996 | Issues: 27 | Issues: 143

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

Language: Python | License: Apache-2.0 | Stargazers: 1508 | Issues: 32 | Issues: 216

mnn-llm

An LLM deployment project based on MNN.

Language: C++ | License: Apache-2.0 | Stargazers: 1335 | Issues: 26 | Issues: 174

torch-mlir

The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.

Language: C++ | License: NOASSERTION | Stargazers: 1211 | Issues: 249 | Issues: 622

onnx-mlir

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure

Language: C++ | License: Apache-2.0 | Stargazers: 695 | Issues: 37 | Issues: 674

InferLLM

A lightweight LLM inference framework

Language: C++ | License: Apache-2.0 | Stargazers: 639 | Issues: 11 | Issues: 54

quanto

A PyTorch quantization toolkit

Language: Python | License: Apache-2.0 | Stargazers: 613 | Issues: 8 | Issues: 65

tpu-mlir

Machine learning compiler based on MLIR for Sophgo TPU.

Language: C++ | License: NOASSERTION | Stargazers: 489 | Issues: 20 | Issues: 85

CUDA-Learn-Note

🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

Language: Cuda | License: GPL-3.0 | Stargazers: 435 | Issues: 6 | Issues: 0

QPyTorch

Low Precision Arithmetic Simulation in PyTorch

Language: Python | License: MIT | Stargazers: 254 | Issues: 12 | Issues: 51

pymlir

Python interface for MLIR - the Multi-Level Intermediate Representation

Language: Python | License: BSD-3-Clause | Stargazers: 188 | Issues: 13 | Issues: 16

PTQ4ViT

Post-Training Quantization for Vision Transformers.

LLM-FP4

The official implementation of the EMNLP 2023 paper LLM-FP4

Language: Python | License: MIT | Stargazers: 142 | Issues: 5 | Issues: 9
Language: Python | License: BSD-3-Clause-Clear | Stargazers: 96 | Issues: 5 | Issues: 5

nelli

A lightweight, Pythonic frontend for MLIR

Language: C++ | License: Apache-2.0 | Stargazers: 77 | Issues: 5 | Issues: 1

SHARK-Turbine

Unified compiler/runtime for interfacing with PyTorch Dynamo.

Language: Python | License: Apache-2.0 | Stargazers: 77 | Issues: 28 | Issues: 439

INT-FP-QSim

Flexible simulator for mixed precision and format simulation of LLMs and vision transformers.

Language: Python | License: Apache-2.0 | Stargazers: 41 | Issues: 6 | Issues: 0

examples

A set of examples around MegEngine

Language: Python | License: Apache-2.0 | Stargazers: 27 | Issues: 2 | Issues: 5