moyix / fauxpilot

FauxPilot - an open-source GitHub Copilot server

CPU instead of GPU

TechnologyClassroom opened this issue

Is there a way to do such a thing with CPU instead of GPU? I know this would be slower, but it would be a cheaper solution and would not depend on NVIDIA.

You won't be able to do that with this particular codebase, which depends heavily on FasterTransformer to get decent latency. However, this gist from @nforest shows how to use any model from Huggingface Transformers with Copilot, and Transformers supports CPU-only inference:

https://gist.github.com/nforest/d1432b917468f5ad24b83954c98e67b1

You should be able to pass pretrained='Salesforce/codegen-16B-multi' and device='cpu' there to run everything on the CPU instead.

I do warn you that it will be a lot slower!
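
For anyone curious what CPU-only inference looks like with Transformers directly, here is a rough sketch. It is not taken from the gist: `CompletionConfig` and `complete` are made-up names for illustration, and the default checkpoint is the much smaller Salesforce/codegen-350M-mono, chosen on the assumption that most people will want to try something light first (the 16B model in fp32 needs roughly 64 GB of RAM for the weights alone). It assumes `torch` and a `transformers` version with CodeGen support are installed.

```python
from dataclasses import dataclass


@dataclass
class CompletionConfig:
    # Hypothetical settings holder; the field names mirror the
    # pretrained=... and device=... options mentioned above.
    pretrained: str = "Salesforce/codegen-350M-mono"  # assumption: small checkpoint for a first test
    device: str = "cpu"
    max_new_tokens: int = 32


def complete(prompt: str, cfg: CompletionConfig) -> str:
    """Return a greedy completion for `prompt`, running on cfg.device."""
    # Imported lazily so the heavy dependencies load only when a
    # completion is actually requested.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(cfg.pretrained)
    model = AutoModelForCausalLM.from_pretrained(cfg.pretrained).to(cfg.device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(cfg.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=cfg.max_new_tokens,
            do_sample=False,  # greedy decoding keeps runs repeatable
            pad_token_id=tokenizer.eos_token_id,
        )

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `complete("def fibonacci(n):", CompletionConfig())` downloads the checkpoint on first use and then generates on the CPU. Swapping in `pretrained="Salesforce/codegen-16B-multi"` should work the same way if the machine has enough memory, just far more slowly.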

Thank you!

Edit: Running off CPU makes this more accessible by moving the barrier to entry from a $3000 rig to a $1000 rig.

@TechnologyClassroom Did you try it? How long did a completion take, and on what CPU? 😃

@1muen I haven't tried it yet. I commented asking about licensing and have not heard back. (That project really deserves to be a full repo too.)