Optimize for consumer GPUs, e.g. 11GB or 16GB
Profintegra commented
I'm not sure it makes sense to load more than one layer from a performance standpoint, but using only 1.6GB out of the 11GB/16GB of a typical consumer GPU is not optimal (and it is super slow).
I've read on Hugging Face that it doesn't make sense to load more layers because only one layer is evaluated at a time. But maybe we can split the model into bigger chunks (several layers each), so that multiple layers are loaded and evaluated together?
I can even try doing it myself. It would be nice to get a bit of guidance, like: is it even feasible, where in the code should the optimization happen, etc.
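To make the chunking idea a bit more concrete, here is a rough sketch of what I have in mind. It is not based on the project's code: `group_layers`, the per-layer size of 1600 MB, the 80-layer count, and the 8000 MB budget are all made-up placeholders, and it only shows how consecutive layers could be packed into chunks that fit a VRAM budget.

```python
def group_layers(layer_sizes_mb, budget_mb):
    """Greedily pack consecutive layers into chunks that fit within budget_mb.

    Hypothetical helper: layer_sizes_mb is a list of per-layer sizes in MB,
    budget_mb is the VRAM we are willing to spend on weights at once.
    Returns a list of chunks, each chunk a list of layer indices.
    """
    chunks, current, used = [], [], 0
    for idx, size in enumerate(layer_sizes_mb):
        if size > budget_mb:
            raise ValueError(f"layer {idx} ({size} MB) exceeds the budget")
        # Start a new chunk when adding this layer would overflow the budget.
        if current and used + size > budget_mb:
            chunks.append(current)
            current, used = [], 0
        current.append(idx)
        used += size
    if current:
        chunks.append(current)
    return chunks


# Assumed numbers for illustration only: 80 layers of 1600 MB each,
# with 8000 MB of an 11GB card reserved for weights.
chunks = group_layers([1600] * 80, 8000)
print(len(chunks), chunks[0])
```

With these assumed numbers each chunk would hold five layers, so there would be far fewer load steps per forward pass than loading one layer at a time. Whether that actually helps depends on where the time goes (disk reads vs. compute), which is part of what I'm asking.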