Use any small model, like LaMini, Flan-T5-783M, etc.
gitknu opened this issue
Hello! Sorry if this question is basic or a duplicate, but I couldn't find a solution in the existing issues, and trying on my own didn't get me anywhere.
Details:
The problem is that I have a low-end PC which can run Alpaca and Vicuna (both 7B), but quite slowly. On the other hand, while trying different models I noticed that models under 1B parameters run quite well, mainly ones based on Flan-T5. They give good results for my machine and are fast enough (about 3-5 tokens per second). Another strong point is using them with reference text: for example, asking "based on this text, answer ..." gives me an almost perfect answer. But pasting the text in every time seems like bad practice to me, because of the time spent, etc.
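For reference, this is roughly the prompting pattern I mean. It is only a minimal sketch: the helper name, the template wording, and the example context are my own illustration, not part of any library or tool.

```python
# Sketch of the "based on this text, answer ..." pattern described above:
# prepend a fixed reference text to every question so a small
# instruction-tuned model (e.g. LaMini-Flan-T5-783M) can answer from it.
# The function name and template are illustrative assumptions.

CONTEXT = (
    "Flan-T5 is a family of instruction-tuned encoder-decoder models "
    "available in sizes well under 1B parameters."
)

def build_prompt(question: str, context: str = CONTEXT) -> str:
    """Wrap a question with the reference text before sending it to the model."""
    return (
        "Based on this text, answer the question.\n\n"
        f"Text: {context}\n\n"
        f"Question: {question}"
    )

# The resulting string is what gets passed to the model each time --
# which is exactly the repetition I would like to avoid.
prompt = build_prompt("Roughly how large are Flan-T5 models?")
print(prompt)
```

Having to rebuild and resend this full prompt (context included) for every single question is the overhead I am asking about.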
Short question:
Is there any way to use this tool with any of these models?
- LaMini-Flan-T5-783M
- Flan-T5-Alpaca (770M or something)
- RWKV (under 1.5B)
- (any other good small models, under 1B parameters)
If you could provide detailed instructions, I would be very grateful! Solutions other than privateGPT, etc. are also welcome!
Thank you for your understanding and your answers, and sorry for any inconvenience!