Use any small model, like LaMini, Flan-T5-783M, etc.
gitknu opened this issue
Hello! Sorry if this question is basic or a duplicate, but I couldn't find a solution in the existing issues, and trying on my own didn't get me anywhere.
Details:
The problem is that I have a low-end PC which can run Alpaca and Vicuna (both 7B), but quite slowly. On the other hand, while trying different models I noticed that models under 1B parameters run quite well, mainly ones based on Flan-T5. They give good results for my machine and are fast enough (about 3-5 tokens per second). Another strong point is using them with reference text: for example, asking "based on this text, answer ..." gives me an almost perfect answer. But pasting the text in every time seems like bad practice to me, because of the time spent, etc.
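For reference, this is roughly the prompting pattern I mean. It is only a minimal sketch: the helper name, the template wording, and the example context are my own illustration, not part of any library or tool.

```python
# Sketch of the "based on this text, answer ..." pattern described above:
# prepend a fixed reference text to every question so a small
# instruction-tuned model (e.g. LaMini-Flan-T5-783M) can answer from it.
# The function name and template are illustrative assumptions.

CONTEXT = (
    "Flan-T5 is a family of instruction-tuned encoder-decoder models "
    "available in sizes well under 1B parameters."
)

def build_prompt(question: str, context: str = CONTEXT) -> str:
    """Wrap a question with the reference text before sending it to the model."""
    return (
        "Based on this text, answer the question.\n\n"
        f"Text: {context}\n\n"
        f"Question: {question}"
    )

# The resulting string is what gets passed to the model each time --
# which is exactly the repetition I would like to avoid.
prompt = build_prompt("Roughly how large are Flan-T5 models?")
print(prompt)
```

Having to rebuild and resend this full prompt (context included) for every single question is the overhead I am asking about.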
Short question:
Is there any way to use this tool with any of these models?
- LaMini-Flan-T5-783M
- Flan-T5-Alpaca (770M or something)
- RWKV (under 1.5B)
- (any other good small models, under 1B parameters)
If you could provide detailed instructions, I would be very grateful! Solutions other than privateGPT, etc. are also welcome!
Thank you for your understanding and your answers, and sorry for any inconvenience!