WasmEdge / WasmEdge

WasmEdge is a lightweight, high-performance, and extensible WebAssembly runtime for cloud native, edge, and decentralized applications. It powers serverless apps, embedded functions, microservices, smart contracts, and IoT devices.

Home Page: https://WasmEdge.org


LFX Mentorship (Mar-May, 2024): Integrate Intel Extension for Transformers as a new WASI-NN backend

hydai opened this issue · comments

Summary

  • Description: LLMs are a hot topic, and more and more frameworks aim to make LLM execution faster. WasmEdge has already integrated llama.cpp as one of its backends. Running LLMs on CPU alone is a huge benefit for users who don't have a GPU. We would like to integrate Intel Extension for Transformers as a new WASI-NN backend to provide faster CPU inference performance.

Details

  • Pre-test: To apply for this mentorship, you MUST finish a pre-test. Pre-test link: #3182
  • Expected Outcome: A new plugin provides an Intel Extension for Transformers WASI-NN backend, a test suite for validating the plugin, documents, and examples for explaining how to use the plugin.
  • Recommended Skills: C++, Wasm
  • Mentor(s):

Appendix

  1. Intel Extension for Transformers: https://github.com/intel/intel-extension-for-transformers
  2. WASI-NN: https://github.com/second-state/wasmedge-wasi-nn
  3. WASI-NN llama.cpp backend (see ggml.h and ggml.cpp): https://github.com/WasmEdge/WasmEdge/tree/master/plugins/wasi_nn
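For context, the existing llama.cpp (GGML) backend is selected at run time through WasmEdge's `--nn-preload` flag, and a new backend would plug into the same WASI-NN mechanism under its own encoding name. Below is a minimal sketch of how the current GGML backend is typically invoked; the model file and the `.wasm` application name are placeholders, not files shipped with this issue.

```shell
# Run a Wasm app against the WASI-NN plugin's llama.cpp (GGML) backend.
# "default" is the model alias the guest looks up; GGML is the backend
# encoding; AUTO lets the runtime pick the execution target.
# llama-2-7b-chat.Q5_K_M.gguf and wasmedge-ggml-llama.wasm are placeholders.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  wasmedge-ggml-llama.wasm default
```

A backend built on Intel Extension for Transformers would presumably register its own encoding tag alongside GGML, so the same preload pattern would apply with a different backend name.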

It was great seeing the WasmEdge demo with LLMs in the last community meeting; looking forward to contributing to this when applications open.


Fixed by #3260