CPU instead of GPU

Is there a way to do such a thing with CPU instead of GPU? I know this would be slower, but it would be a cheaper solution and would not depend on NVIDIA.

You won't be able to do that with this particular codebase (it depends heavily on FasterTransformer to get decent latency), but this gist from @nforest shows how to use any model from Huggingface Transformers with Copilot, and Transformers supports CPU-only inference.

You should be able to pass pretrained=Salesforce/codegen-16B-multi and device='cpu' there to run everything on the CPU instead.
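For reference, here is a minimal sketch of what CPU-only generation looks like when calling Transformers directly. It is not the gist's interface: the function names `weights_size_gib` and `complete` are made up for illustration, and the default checkpoint is the much smaller Salesforce/codegen-350M-multi (my assumption, chosen so the example fits in ordinary RAM). The 16e9 parameter count used for the memory estimate is the nominal "16B" figure, not an exact count.

```python
import time

# Assumption: a small CodeGen checkpoint for a quick test. Swap in
# "Salesforce/codegen-16B-multi" only if the machine has enough RAM.
MODEL_NAME = "Salesforce/codegen-350M-multi"
DEVICE = "cpu"


def weights_size_gib(num_params, bytes_per_param=4):
    """Rough size of the weights alone, in GiB (float32 uses 4 bytes per parameter)."""
    return num_params * bytes_per_param / 2**30


def complete(prompt, model_name=MODEL_NAME, device=DEVICE, max_new_tokens=32):
    """Greedy completion of `prompt` with a Huggingface causal LM on `device`."""
    # Imported here so the helper above works without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(f"Nominal 16B model in float32: ~{weights_size_gib(16e9):.0f} GiB of weights")
    start = time.perf_counter()
    print(complete("def fibonacci(n):"))
    print(f"Completion took {time.perf_counter() - start:.1f} s on {DEVICE}")
```

The memory estimate covers the weights only, so treat it as a lower bound when deciding whether a given machine can hold the 16B model at all.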

I do warn you that it will be a lot slower!

Thank you!

Edit: Running off CPU makes this more accessible by moving the barrier to entry from a $3000 rig to a $1000 rig.

@TechnologyClassroom Did you try it? How long did a completion take, and on what CPU? 😃

@1muen I haven't tried it yet. I commented asking about licensing and have not heard back. (That project really deserves to be a full repo too.)