[EPIC] Model support dashboard (v2)

mudler opened this issue 2 years ago · comments

Ettore Di Giacinto commented 2 years ago

This epic is the main tracker for all the backend additions that should be part of LocalAI v2, along with ongoing efforts.

The objective is to release a v2 that deprecates old models which are now superseded and adds a new set. To achieve this, my idea is to clean up the current state and start pinning dependencies for all the backends that require specific environment settings (the Python-based ones).

Some backends will be deprecated as superseded, and as such some repositories will be archived (still TBD).

Backends:

LocalAI [bot] · Answer 1 · Mon Oct 02 2023 20:02:39 GMT+0800 (China Standard Time)

⚠️⚠️⚠️⚠️⚠️

Hi! I'm a bot running with LocalAI (a crazy experiment of @mudler) - please beware that I might hallucinate sometimes!

But... I can also be funny or helpful 😸, and I can generally provide good tips or point to places to look in the documentation or in the code, based on what you wrote in the issue.

Don't engage in conversation with me,
I don't support (yet) replying!

⚠️⚠️⚠️⚠️⚠️

Issue description:
The issue is related to the LocalAI project and its implementation of the Image endpoint. The user wants to create a cute baby sea otter image with the given prompt, size, and number of images. However, the code provided in the readInput function is not able to parse the input correctly and returns an error.

Proposed solution:
The issue seems to be related to the parsing of the input parameters. The user can try modifying the code in the readInput function to handle the specific format of the prompt, size, and number of images provided by the OpenAI API. Alternatively, they can try using a different API that provides more flexibility in the input format or use a different tool to generate the image.

If the issue persists, the user can also check if there are any updates available for the LocalAI project or report the issue on the GitHub repository to get help from the community.

Sources:

Bowen · Answer 2 · Sat Oct 14 2023 18:11:40 GMT+0800 (China Standard Time)

I am wondering whether we should have an outline like the one below:

C++
- llama.cpp-series LLMs (e.g. Qwen, chatglm)
Python
- Huggingface
- autoGPT
Rust
- rustformers
- candle

If we keep this in mind, we can assign each model to a specific backend. For example, I saw that the Qwen LLM project describes its C++ implementation as "working in the same way as llama.cpp". So maybe it can be loaded using llama.cpp, and it should be compatible with our C++ backend.

I also suggest adding labels for each backend family, so that we can tell whether a new model is compatible with our backends.

Ettore Di Giacinto · Answer 3 · Sat Nov 04 2023 22:35:07 GMT+0800 (China Standard Time)

The conda branch was merged in #1144. I'm now looking into bringing the llama.cpp backend on par with llama-go and adding llava support to it.

I'm going to refactor and re-layout things in the new backend directory too

Bowen · Answer 4 · Sun Nov 05 2023 10:34:31 GMT+0800 (China Standard Time)

@mudler thank you for mentioning it. I have some questions (the second and third ones) that may need your help: #1180 (comment). What I am thinking is that we can use a tiny model to test the Rust backend features and make sure everything is OK. Then maybe we can merge it.

And if everything is OK, we can add other LLMs. I plan to support Llama2 (60% finished, but it still has an issue) and whisper, and also to support the ONNX format.

Ettore Di Giacinto · Answer 5 · Mon Nov 13 2023 23:14:18 GMT+0800 (China Standard Time)

Breaking re-layout PR: #1279

Erich Schubert · Answer 6 · Mon Nov 27 2023 23:24:05 GMT+0800 (China Standard Time)

Caching/preloading of transformer and similar models: these are currently automatically loaded on startup into /root/.cache/huggingface/. It seems to be enough to set TRANSFORMERS_CACHE in the environment to the models folder, so maybe this only requires a documentation addition.

lc · Answer 7 · Wed Dec 27 2023 12:23:20 GMT+0800 (China Standard Time)

I may start looking into #1273 while this progresses. What do you think?

Ettore Di Giacinto · Answer 8 · Wed Dec 27 2023 15:36:10 GMT+0800 (China Standard Time)

> I may start looking into #1273 while this progresses. What do you think?

Please feel free to go ahead. There are many pieces involved here, and any help is more than appreciated 👍

Ettore Di Giacinto · Answer 9 · Tue Mar 05 2024 01:48:03 GMT+0800 (China Standard Time)

> caching/preloading of transformer and similar models, these are currently automatically loaded on startup into /root/.cache/huggingface/. It seems to be enough to set TRANSFORMERS_CACHE in the environment to the models folder, so maybe this only requires a documentation addition.

In #1746 I'm taking care of automatically binding the HF cache variables to the models directories if they are not already set.
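
For readers who want to see what "binding the cache variables if not set already" could look like, here is a minimal Python sketch. It is not the code from #1746: the function name `bind_hf_cache` is hypothetical, and the list of variables (`TRANSFORMERS_CACHE`, mentioned above, plus `HF_HOME`) is an assumption, since the thread does not say which variables are bound.

```python
import os

# Assumption: the thread only names TRANSFORMERS_CACHE; HF_HOME is added here
# for illustration and may not match what #1746 actually binds.
HF_CACHE_VARS = ("TRANSFORMERS_CACHE", "HF_HOME")


def bind_hf_cache(models_dir, env=None):
    """Point the Hugging Face cache variables at models_dir.

    Variables that are already set are left untouched, so an explicit
    user setting always wins. Hypothetical helper, not LocalAI's API.
    """
    if env is None:
        env = os.environ
    for name in HF_CACHE_VARS:
        # setdefault only writes the value when the key is missing.
        env.setdefault(name, models_dir)
    return env


if __name__ == "__main__":
    # Usage: call this before starting any Python backend that imports
    # transformers, so downloads land in the models folder.
    bind_hf_cache("/models")
    print(os.environ["TRANSFORMERS_CACHE"])
```

Using `setdefault` keeps the behaviour described in the comment above: the models directory is only a fallback, and anyone who already exported one of these variables keeps their own location.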