SciSharp / LLamaSharp

A C#/.NET library to run LLMs (🦙 LLaMA/LLaVA) on your local device efficiently.

Home Page: https://scisharp.github.io/LLamaSharp


Roadmap to v1.0.0

AsakusaRinne opened this issue

Hi all, thanks to the community's effort, LLamaSharp now has much richer features than it did at the beginning. Meanwhile, the distribution of the backend packages may change soon. Therefore I think it's time to publish v1.0.0 in the next several weeks. In this issue we'll discuss the roadmap to v1.0.0 and list the TODO items. Any ideas about it are welcome here.

Required ones:

  • Include libraries with different AVX support levels in the backend packages. #316 (see the sketch after this list)
  • Update the documentation. #267
  • Enhance the text-completion APIs. #239
  • Major bug fixes. #273 #265 #260 #118
  • Improve local document search support. #289 #305
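
On the first item: the plan in #316 is for one backend package to ship several native library builds (no AVX, AVX, AVX2) and load the most capable one the CPU supports. A minimal sketch of that selection step, using hypothetical library paths rather than LLamaSharp's actual loader code:

```csharp
using System.Runtime.Intrinsics.X86;

// Hypothetical sketch of runtime backend selection (the paths are made up):
// probe CPU features and pick the most capable native build available.
static string PickNativeLibrary()
{
    if (Avx2.IsSupported)
        return "runtimes/native/avx2/libllama";
    if (Avx.IsSupported)
        return "runtimes/native/avx/libllama";
    return "runtimes/native/noavx/libllama";
}
```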

Possible ones:

  • Add benchmark tests for LLamaSharp. #237
  • Support .NET 8, since it's about to come out.
  • Add an Android backend.

I'm quite hesitant about whether we should refactor the executors to support high-level batch decoding APIs in v1.0.0, because it's really a large amount of work. It would be very useful for building a more efficient service, but it may delay our release by some weeks. I'd prefer to include it if the distribution of our backend package does not change significantly in the next patch release.

I wouldn't expect a batch-decoding executor any time soon; it's going to be a lot of work to design the API to support all of the various features in an easy-to-use way. So yeah, I'd agree we shouldn't wait for it.
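
For concreteness, "high-level batch decoding" means multiplexing several independent conversations over one shared model/context so their tokens are submitted to llama.cpp in a single batch. A purely hypothetical sketch of the API surface such an executor might expose; none of these types exist in LLamaSharp today:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Purely hypothetical API surface, for discussion only: many logical
// conversations share one model/context and are decoded together.
public interface IBatchedExecutor : IDisposable
{
    // Each conversation would get its own sequence id inside the shared context.
    IConversation CreateConversation(string systemPrompt);
}

public interface IConversation : IDisposable
{
    // Prompts are queued and decoded when the executor runs its next batch,
    // so work from many conversations reaches llama.cpp in one call.
    IAsyncEnumerable<string> PromptAsync(string text, CancellationToken ct = default);
}
```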

I would add support for stop sequences in Kernel Memory and the "query your documents" feature; otherwise I am afraid it is almost unusable, which is bad, because at least from a company perspective, querying your own documents is the most useful thing, rather than asking general questions to an AI.
What do you guys think?
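
For context, a stop sequence simply truncates generation when the model emits a marker such as "Question:" or "</s>". A generic sketch of that idea applied to a token stream; this is a hypothetical helper, not the actual Kernel Memory integration:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: end a streamed completion at the first stop sequence.
// For simplicity it buffers the whole output; a production version would
// flush any text that can no longer be the start of a stop sequence.
static async IAsyncEnumerable<string> ApplyStopSequences(
    IAsyncEnumerable<string> tokens, IReadOnlyList<string> stops)
{
    var buffer = new StringBuilder();
    await foreach (var token in tokens)
    {
        buffer.Append(token);
        foreach (var stop in stops)
        {
            int index = buffer.ToString().IndexOf(stop, StringComparison.Ordinal);
            if (index >= 0)
            {
                yield return buffer.ToString(0, index); // drop the stop marker itself
                yield break;
            }
        }
    }
    yield return buffer.ToString();
}
```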


Makes sense to me. Would you like to work on it? If your time is limited, it's okay to open an issue to leave it to us. :)


Added issue #289.

Unfortunately I would not even know where to start :)


Please add a clearer example project in the future. It's really difficult as a newbie to get LLamaSharp to work at all. I'm about to give up on it; I tried the Discord but it was fruitless.

Sure, we will. Could you please give some specific suggestions, such as a certain example that confused you?
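
For anyone else getting started, a minimal console chat in the spirit of the repo's examples might look roughly like this (a sketch against the v0.x API; exact type and property names can differ between releases):

```csharp
using System;
using System.Collections.Generic;
using LLama;
using LLama.Common;

// Sketch against the v0.x-era API; names like ModelParams, InteractiveExecutor
// and InferenceParams may differ slightly between releases.
var parameters = new ModelParams("path/to/model.gguf")
{
    ContextSize = 2048,
    GpuLayerCount = 0 // CPU only; raise this to offload layers to the GPU
};

using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);
var executor = new InteractiveExecutor(context);

var inferenceParams = new InferenceParams
{
    MaxTokens = 256,
    AntiPrompts = new List<string> { "User:" } // stop when the model hands the turn back
};

Console.Write("User: ");
var prompt = Console.ReadLine();
await foreach (var text in executor.InferAsync($"User: {prompt}\nAssistant:", inferenceParams))
{
    Console.Write(text);
}
```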

Android backend, please. I got the Android binaries for llama.cpp, but they don't work with a Unity Android project...

I am a developer from China who hopes LLamaSharp will support more of the large models from China, such as Yi-32B-200k, Qwen-72B Chat, etc.

LLamaSharp actually should have pretty good Chinese support. @AsakusaRinne has done a lot of work on encoding for Chinese models, and I did some work on detokenisation for complex characters (which should help with any language that has multi-codepoint characters).
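
For reference, the multi-codepoint problem is that one character's UTF-8 bytes can be split across several tokens, so bytes have to be buffered until a complete code point arrives instead of being decoded per token. A generic illustration of the idea using .NET's incremental decoder (not the actual LLamaSharp detokeniser):

```csharp
using System.Text;

// System.Text.Decoder keeps incomplete UTF-8 sequences buffered internally,
// so a character split across two tokens is only emitted once it completes.
Decoder decoder = Encoding.UTF8.GetDecoder();

string DecodeTokenBytes(byte[] tokenBytes)
{
    var chars = new char[Encoding.UTF8.GetMaxCharCount(tokenBytes.Length)];
    int written = decoder.GetChars(tokenBytes, 0, tokenBytes.Length, chars, 0, flush: false);
    return new string(chars, 0, written); // empty if the bytes end mid-character
}
```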

If there are models that llama.cpp doesn't support I'd suggest opening an issue on llama.cpp requesting that specific model and then also opening an issue here referencing that issue. Once llama.cpp supports a model we'll support it too!

Version 0.8.1 deployed normally under Unity via NuGet packages, but version 0.9.1 deploys with errors. The errors occur when installing Semantic Kernel and Kernel Memory. If possible, please test the deployment. Most likely the problem is not too significant, and it is better to fix it sooner so that it does not cause accumulating problems.

@Xsanf could you open an issue with the errors you're seeing? Unity isn't one of our test targets, but if there are small tweaks we can make to improve compatibility we can certainly do that!

Unfortunately I have no experience with NuGet; I used it for the first time in this project. I installed according to the instructions at https://github.com/eublefar/LLAMASharpUnityDemo

You don't even need to run the example itself; the problem occurs when importing the NuGet packages. The LLamaSharp.kernel-memory.0.9.1 package imports only a single file, LLamaSharp.kernel-memory.nuspec. As far as I understand, the libraries themselves are not installed, yet there are no error messages. The LLamaSharp.semantic-kernel.0.9.1 package installs completely but produces over 200 compilation errors. As far as I can tell, they arise from a conflict with existing standard packages or from a mismatch in the compilation target. There is no point in citing the errors themselves; you will see them when you try to install the package.

I'm guessing there's probably some kind of configuration error with the NuGet package.

Versions 0.7.1 and 0.8.1 were installed without errors and the example worked fine. So the error appeared only in version 0.9.1.

I'm wondering if there are plans to make LLamaSharp compatible with multi-modal input (images), for use with models like LLaVA.

@dcostea

#609 includes LLaVA support on the InteractiveExecutor.
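
From the PR, usage looks roughly like this (a sketch: LLavaWeights and the image-attachment property are taken from the #609-era API and may change before release):

```csharp
using System;
using LLama;
using LLama.Common;

// Sketch based on #609 (names may change before release): load the main
// model plus the LLaVA/CLIP projection model, attach an image, then ask
// about it via the <image> placeholder in the prompt.
var parameters = new ModelParams("path/to/llava-model.gguf");
using var weights = LLamaWeights.LoadFromFile(parameters);
using var clip = LLavaWeights.LoadFromFile("path/to/mmproj.gguf");
using var context = weights.CreateContext(parameters);

var executor = new InteractiveExecutor(context, clip);
executor.ImagePaths.Add("path/to/photo.jpg");

await foreach (var text in executor.InferAsync(
    "USER: <image>\nDescribe the image.\nASSISTANT:",
    new InferenceParams { MaxTokens = 256 }))
{
    Console.Write(text);
}
```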


I just can't wait to see it in the next release!