VS (Vui Seng Chua) (vuiseng9)

User data from GitHub: https://github.com/vuiseng9

Company: @Intel

GitHub: @vuiseng9

VS (Vui Seng Chua)'s repositories

nncf

PyTorch*-based Neural Network Compression Framework for enhanced OpenVINO™ inference

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

data-parallel-CPP

Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben Ashbaugh, James Brodman, Michael Kinsner, John Pennycook, Xinmin Tian (Apress, 2020).

Language: CMake | License: NOASSERTION | Stargazers: 0 | Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

EAGLE

[ICML'24] EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

ipex

A Python package that extends official PyTorch to deliver easy performance gains on Intel platforms

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, etc.) on Intel CPU and GPU (e.g., a local PC with an iGPU, or a discrete GPU such as Arc, Flex, and Max). A PyTorch LLM library that seamlessly integrates with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, ModelScope, etc.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

llm-awq

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

mlperf-inference

Reference implementations of MLPerf™ inference benchmarks

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

mlperf-v3.0-intel

This repository contains the results and code for the MLPerf™ Inference v3.0 benchmark.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

mlperf-v3.1-intel

This repository contains the results and code for the MLPerf™ Inference v3.1 benchmark.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

mm_amx

Matrix multiplication (matmul) using AMX instructions

Stargazers: 0 | Issues: 0

oneAPI-samples

Samples for Intel® oneAPI Toolkits

License: MIT | Stargazers: 0 | Issues: 0

optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

optimum-intel

Accelerate inference of 🤗 Transformers with Intel optimization tools

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C | License: MIT | Stargazers: 0 | Issues: 0

sd-perf

Quick script to profile Stable Diffusion performance

Language: Python | Stargazers: 0 | Issues: 2

SparseFinetuning

Repository for Sparse Finetuning of LLMs via a modified version of MosaicML's llm-foundry

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Spec-Bench

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

speculative-sampling

Simple implementation of Speculative Sampling in NumPy for GPT-2.

Language: Python | Stargazers: 0 | Issues: 0

SqueezeLLM

SqueezeLLM: Dense-and-Sparse Quantization

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

torch-custom-linear

Custom implementation of a linear layer

Language: Python | Stargazers: 0 | Issues: 2

transformers

🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

wanda

A simple and effective LLM pruning approach.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0