LFX Mentorship (Mar-May, 2024): Integrate Intel Extension for Transformers as a new WASI-NN backend
hydai opened this issue
Summary
- Description: LLMs are a hot topic, and more and more frameworks aim to make LLM execution faster. WasmEdge has already integrated llama.cpp as one of its backends. Running LLMs on CPU only is a huge benefit for users who don't have a GPU. We would like to integrate Intel Extension for Transformers as a new WASI-NN backend to provide faster CPU inference performance.
Details
- Pre-test: To apply for this mentorship, you MUST finish a pre-test. Pre-test link: #3182
- Expected Outcome: A new plugin that provides an Intel Extension for Transformers WASI-NN backend, a test suite for validating the plugin, and documentation and examples explaining how to use the plugin.
- Recommended Skills: C++, Wasm
- Mentor(s):
- Hung-Ying Tai (@hydai, hydai@secondstate.io)
- dm4 (@dm4, dm4@secondstate.io)
Appendix
- Intel Extension for Transformers: https://github.com/intel/intel-extension-for-transformers
- WASI-NN: https://github.com/second-state/wasmedge-wasi-nn
- WASI-NN llama.cpp backend (see ggml.h and ggml.cpp): https://github.com/WasmEdge/WasmEdge/tree/master/plugins/wasi_nn
It was great seeing the WasmEdge demo with LLMs in the last community meeting; looking forward to contributing to this when applications open.