Beast code in Giters

Henson's starred repositories

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | Apache-2.0 | 38111 379 1580

📚 A summary of fundamental C/C++ technical interview knowledge for job seekers and beginners, covering the language, libraries, data structures, algorithms, systems, networking, and linking and loading of libraries, along with interview experience, recruitment, and referral information.

Language: C++ | NOASSERTION | 33417 869 62

OpenDevin

🐚 OpenDevin: Code Less, Make More

Language: Python | MIT | 26755 282 964

llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

NOASSERTION | 26191 595 71229

mojo

The Mojo Programming Language

Language: Mojo | NOASSERTION | 21789 262 1784

models

A collection of pre-trained, state-of-the-art models in the ONNX format

Language: Jupyter Notebook | Apache-2.0 | 7325 184 387

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python | MIT | 3935 34 421

transformer-debugger

Language: Python | MIT | 3918 25 13

google-10000-english

This repo contains a list of the 10,000 most common English words in order of frequency, as determined by n-gram frequency analysis of Google's Trillion Word Corpus.

NOASSERTION | 3812 107 25

iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

Language: C++ | Apache-2.0 | 2377 84 3522

TechCPP

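[C++ Interview & C++ Study Guide] A collection of the essential knowledge points that C++ backend development engineers need for interviews and day-to-day work.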

2043 27 2

neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Language: Python | Apache-2.0 | 2018 34 186

intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Language: Python | Apache-2.0 | 1996 27 143

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

Language: Python | Apache-2.0 | 1508 32 216