Wei (w32zhong)

Organizations
approach0
t-k-cloud

Wei's repositories

llm.c

LLM training in simple, raw C/CUDA

Language: C | Stargazers: 1 | Issues: 0 | Issues: 0
Language: JavaScript | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0
Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

causal-conv1d

Causal depthwise conv1d in CUDA, with a PyTorch interface

Language: Cuda | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

CS-Drafting

Cascade Speculative Drafting

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0
License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

EAGLE

EAGLE: Lossless Acceleration of LLM Decoding by Feature Extrapolation

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

filebrowser

📂 Web File Browser

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

laser

The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

marker

Convert PDF to markdown quickly with high accuracy

License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

MCSD

Multi-Candidate Speculative Decoding

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Ouroboros

Ouroboros: Speculative Decoding with Large Model Enhanced Drafting

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Progressive-Hint

This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models"

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

pytorch-that-I-successfully-built

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

rerope

Rectified Rotary Position Embeddings

Stargazers: 0 | Issues: 0 | Issues: 0

search_with_lepton

Building a quick conversation-based search demo with Lepton AI.

Language: TypeScript | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

SeeAct

SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

self-speculative-decoding

Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding"

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Sequoia

scalable and robust tree-based speculative decoding algorithm

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

tkblog

my blog.

Language: PHP | Stargazers: 0 | Issues: 1 | Issues: 0

toolformer-pytorch

Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

vivado-risc-v

Xilinx Vivado block designs for FPGA RISC-V SoC running Debian Linux distro

Language: Tcl | Stargazers: 0 | Issues: 0 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0