Robert Shaw (robertgshaw2-neuralmagic)

Company: @neuralmagic

Location: Boston

Twitter: @robertshaw21

Robert Shaw's repositories

vllm-k8s

Example deploying vLLM on GKE

Language: Jupyter Notebook | Stargazers: 2 | Issues: 1 | Issues: 0

deepsparse-continuous-batching

DeepSparse Continuous Batching

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0
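Continuous batching admits new requests into the running batch as soon as earlier ones finish, instead of waiting for the whole batch to drain. A toy, pure-Python scheduler sketch of the idea (the names and structure are hypothetical, not the DeepSparse implementation):

```python
from dataclasses import dataclass

@dataclass
class Request:
    rid: int
    tokens_left: int  # decode steps remaining for this request

def continuous_batching(queue, max_batch=2):
    """Toy token-level scheduler: each step decodes one token for every
    active request, admits waiting requests into freed slots, and
    records the order in which requests finish."""
    queue = list(queue)
    active, finished, step = [], [], 0
    while queue or active:
        # admit new requests into any free batch slots
        while queue and len(active) < max_batch:
            active.append(queue.pop(0))
        step += 1
        for r in active:
            r.tokens_left -= 1
        finished.extend(r.rid for r in active if r.tokens_left == 0)
        active = [r for r in active if r.tokens_left > 0]
    return finished, step

reqs = [Request(0, 3), Request(1, 1), Request(2, 2)]
order, steps = continuous_batching(reqs)  # request 1 finishes first; slot reused
```

Because request 1 frees its slot after one step, request 2 starts immediately, and all 6 tokens decode in 3 steps at batch size 2 rather than waiting for a full static batch.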

llm-compressor-example

Example using llm-compressor

Language: Python | Stargazers: 1 | Issues: 0 | Issues: 0

marlin-example

Example of quantizing and saving a model with Marlin

Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0

mistral-self-rag

Training Mistral on the Self-RAG task

Language: Python | Stargazers: 1 | Issues: 1 | Issues: 0

vllm-benchmarking

Benchmarking vLLM

Language: Jupyter Notebook | Stargazers: 1 | Issues: 0 | Issues: 0
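The core of any serving benchmark is the summary math: total generated tokens over wall-clock time, plus latency percentiles. A minimal, self-contained sketch of that reporting step (the function names and the nearest-rank percentile choice are illustrative, not taken from this repo):

```python
import math

def percentile(values, p):
    """Nearest-rank percentile of a non-empty list of latencies."""
    s = sorted(values)
    k = max(0, math.ceil(p / 100 * len(s)) - 1)
    return s[k]

def summarize(latencies_s, total_tokens, wall_s):
    """Report throughput (tokens/sec over the whole run) and tail latency."""
    return {
        "throughput_tok_s": total_tokens / wall_s,
        "p50_latency_s": percentile(latencies_s, 50),
        "p95_latency_s": percentile(latencies_s, 95),
    }

stats = summarize([0.8, 1.0, 1.2, 2.0], total_tokens=4096, wall_s=2.0)
```

Note that throughput must be computed from the run's wall-clock time, not from summed per-request latencies, since concurrent requests overlap.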

auto-fp8

Making FP8 Checkpoints

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0
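An FP8 checkpoint stores weights as 8-bit floats plus a per-tensor scale. The key step is choosing the scale so the tensor's max magnitude maps onto E4M3's max finite value (448.0). A simplified sketch of that scaling and clipping (real tools also round values onto the 8-bit grid and write `float8_e4m3fn` tensors; this only models the scale arithmetic):

```python
E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def fp8_scale(weights):
    """Per-tensor scale: max magnitude maps to the E4M3 max value."""
    amax = max(abs(w) for w in weights)
    return amax / E4M3_MAX

def quantize(weights, scale):
    # scaled values land in [-448, 448]; on disk they'd be 8-bit floats
    return [max(-E4M3_MAX, min(E4M3_MAX, w / scale)) for w in weights]

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.5, -2.0, 1.25, 4.48]
s = fp8_scale(w)            # 4.48 / 448.0 = 0.01
back = dequantize(quantize(w, s), s)
```

With the scale chosen this way, no in-range value clips; the real precision loss comes from E4M3's 3-bit mantissa, which this sketch deliberately omits.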

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

bert-benchmarking

Repository for benchmarking BERT performance under various scenarios

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

bert-server-example

DeepSparse Server Running BERT

Language: Python | Stargazers: 0 | Issues: 2 | Issues: 0

zephyr-training

Recreating and experimenting with Zephyr

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

chat-example

Example of calling the chat API

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0
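A chat API call is just an HTTP POST with a structured JSON body. A minimal sketch of building an OpenAI-style `/v1/chat/completions` request (the model name and server URL are assumptions; vLLM, among others, exposes a compatible endpoint):

```python
import json

def chat_payload(model, user_msg, system_msg="You are a helpful assistant.",
                 temperature=0.7, max_tokens=256):
    """Build an OpenAI-style chat completions request body."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_msg},
            {"role": "user", "content": user_msg},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

payload = chat_payload("mistralai/Mistral-7B-Instruct-v0.2", "Hello!")
body = json.dumps(payload)
# To actually send it (server URL is an assumption):
#   requests.post("http://localhost:8000/v1/chat/completions",
#                 data=body, headers={"Content-Type": "application/json"})
```

The response's generated text then lives under `choices[0].message.content` in the returned JSON.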

deepsparse-llm-server-example

Example of serving a DeepSparse LLM from a basic server

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

gptq-benchmarking

Benchmarking GPTQ performance and exploring how the kernels work

Stargazers: 0 | Issues: 1 | Issues: 0

gptq-experiments

Experiments running GPTQ

Stargazers: 0 | Issues: 1 | Issues: 0

gptq-serialization-example

Example of GPTQ serialization

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

lm-evaluation-harness

A framework for few-shot evaluation of language models.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
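Few-shot evaluation works by prepending k solved examples to each target question before scoring the model's continuation. A simplified sketch of that prompt assembly (the Q:/A: template is illustrative; the harness's actual templates vary per task):

```python
def build_fewshot_prompt(examples, question, k=2):
    """Assemble a k-shot prompt: k solved (question, answer) pairs
    followed by the target question with an empty answer slot."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples[:k]]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

examples = [("2 + 2 = ?", "4"), ("3 * 3 = ?", "9"), ("10 - 7 = ?", "3")]
prompt = build_fewshot_prompt(examples, "5 + 6 = ?", k=2)
```

The model is then scored on what it generates (or on the log-likelihood it assigns) after the final "A:".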

marlin

FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
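The INT4 side of an FP16xINT4 kernel stores eight 4-bit weights per 32-bit word; the kernel unpacks and dequantizes them on the fly against FP16 activations. A pure-Python sketch of just the packing scheme (a toy illustration of the idea, not Marlin's actual layout, which also interleaves values for GPU-friendly access):

```python
def pack_int4(values):
    """Pack eight unsigned 4-bit values (0..15) into one 32-bit word,
    lowest nibble first."""
    assert len(values) == 8 and all(0 <= v < 16 for v in values)
    word = 0
    for i, v in enumerate(values):
        word |= v << (4 * i)
    return word

def unpack_int4(word):
    """Recover the eight 4-bit values from a packed 32-bit word."""
    return [(word >> (4 * i)) & 0xF for i in range(8)]

vals = [1, 15, 0, 7, 8, 3, 12, 5]
word = pack_int4(vals)
assert unpack_int4(word) == vals  # lossless round trip
```

The 4x memory reduction versus FP16 is what buys the speedup at small batch sizes, where weight loading, not compute, is the bottleneck.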

nm-vllm-example

Example running nm-vllm

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

one-shot-mpt-gsm-8k

Experiments for applying one-shot optimization

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

sparse-finetuning

Repository for sparse fine-tuning of LLMs via a modified version of MosaicML's llmfoundry

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

tgi-benchmarking

Benchmarking LLMs on GPUs

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

viggo-finetuning

Example of fine-tuning an LLM on the ViGGO dataset

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

vllm-client

Client for benchmarking vLLM

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

vllm-examples

Examples for benchmarking vLLM

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

vllm-qa-basic-correctness

Repository for basic correctness testing of vLLM

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0