Ziwei Fan (zfan20)

Company: University of Illinois at Chicago

Home Page: https://ziwei-fan.github.io/

Ziwei Fan's starred repositories

llama.cpp

LLM inference in C/C++

LLM101n

LLM101n: Let's build a Storyteller

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | License: MIT | Stargazers: 22247 | Issues: 218 | Issues: 124

unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | License: Apache-2.0 | Stargazers: 13050 | Issues: 90 | Issues: 611

tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Language: Python | License: Apache-2.0 | Stargazers: 11451 | Issues: 382 | Issues: 3315

Perplexica

Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.

Language: TypeScript | License: MIT | Stargazers: 11318 | Issues: 80 | Issues: 185

ggml

Tensor library for machine learning

tiny-gpu

A minimal GPU design in Verilog to learn how GPUs work from the ground up

Language: SystemVerilog | Stargazers: 6734 | Issues: 65 | Issues: 22

ToolBench

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Language: Python | License: Apache-2.0 | Stargazers: 4633 | Issues: 51 | Issues: 272

matmulfreellm

Implementation for MatMul-free LM.

Language: Python | License: Apache-2.0 | Stargazers: 2741 | Issues: 43 | Issues: 23

reflexion

[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning

Language: Python | License: MIT | Stargazers: 2181 | Issues: 31 | Issues: 31

c-style

My favorite C programming practices.

ThunderKittens

Tile primitives for speedy kernels

Language: Cuda | License: MIT | Stargazers: 1409 | Issues: 25 | Issues: 20

GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Language: Python | License: Apache-2.0 | Stargazers: 1280 | Issues: 17 | Issues: 46

distributed-llama

Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.

Language: C++ | License: MIT | Stargazers: 1082 | Issues: 19 | Issues: 46

granite-code-models

Granite Code Models: A Family of Open Foundation Models for Code Intelligence

PiPPy

Pipeline Parallelism for PyTorch

Language: Python | License: BSD-3-Clause | Stargazers: 676 | Issues: 37 | Issues: 255

unet.cu

UNet diffusion model in pure CUDA

Language: Cuda | Stargazers: 537 | Issues: 2 | Issues: 0

RepoAgent

An LLM-powered repository agent designed to assist developers and teams in generating documentation and understanding repositories quickly.

Language: Python | License: Apache-2.0 | Stargazers: 238 | Issues: 9 | Issues: 23

Atom

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

awesome-emulators-simulators

A curated list of software emulators and simulators of PCs, home computers, mainframes, consoles, robots and much more...

CEPE

[ACL 2024] Long-Context Language Modeling with Parallel Encodings

Language: Python | License: MIT | Stargazers: 118 | Issues: 5 | Issues: 5

Awesome-Mainframes

Awesome list of mainframe related resources & projects

bark.cpp

Port of Suno AI's Bark in C/C++ for fast inference

Language: C++ | License: MIT | Stargazers: 48 | Issues: 0 | Issues: 0

PLTranslationEmpirical

Artifact repository for the paper "Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code", In Proceedings of The 46th IEEE/ACM International Conference on Software Engineering (ICSE 2024), Lisbon, Portugal, April 2024

Language: Python | License: MIT | Stargazers: 36 | Issues: 2 | Issues: 1

llama_duo

asynchronous/distributed speculative evaluation for llama3

Language: C++ | License: MIT | Stargazers: 32 | Issues: 2 | Issues: 1
Language: Python | License: Apache-2.0 | Stargazers: 11 | Issues: 1 | Issues: 0