LFX Mentorship (Mar-May, 2024): Integrate Intel Extension for Transformers as a new WASI-NN backend
hydai opened this issue
Summary
- Description: LLMs are a hot topic, and more and more frameworks aim to make LLM execution faster. WasmEdge has already integrated llama.cpp as one of its backends. Running LLMs on CPU only is a huge benefit for users who don't have a GPU. We would like to integrate Intel Extension for Transformers as a new WASI-NN backend to provide faster CPU inference performance.
Details
- Pre-test: To apply for this mentorship, you MUST finish a pre-test. Pre-test link: #3182
- Expected Outcome: A new plugin that provides an Intel Extension for Transformers WASI-NN backend, a test suite for validating the plugin, and documentation and examples explaining how to use the plugin.
- Recommended Skills: C++, Wasm
- Mentor(s):
- Hung-Ying Tai (@hydai, hydai@secondstate.io)
- dm4 (@dm4, dm4@secondstate.io)
Appendix
- Intel Extension for Transformers: https://github.com/intel/intel-extension-for-transformers
- WASI-NN: https://github.com/second-state/wasmedge-wasi-nn
- WASI-NN llama.cpp backend (see ggml.h and ggml.cpp): https://github.com/WasmEdge/WasmEdge/tree/master/plugins/wasi_nn
It was great seeing the WasmEdge demo with LLMs in the last community meeting; looking forward to contributing to this when applications open.