fauxpilot / fauxpilot

FauxPilot - an open-source alternative to GitHub Copilot server

CUDA runtime error : out of memory

crimsonwisp opened this issue

When I run the 6B model on a T4 (16 GB RAM), the model seems to keep allocating memory as it handles requests, so it crashes with an out-of-memory error after only a few requests.
Is this a bug?
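
One way to check whether memory really grows with each request is to sample GPU memory between requests. The following is only a minimal sketch, assuming `nvidia-smi` is on the PATH of the machine running the server; the helper names (`parse_memory_line`, `gpu_memory_mib`, `is_growing`, `watch`) are hypothetical and not part of FauxPilot.

```python
import subprocess
import time


def parse_memory_line(line: str) -> tuple[int, int]:
    """Parse one 'used, total' CSV line (values in MiB) as printed by nvidia-smi."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def gpu_memory_mib(gpu_index: int = 0) -> tuple[int, int]:
    """Return (used, total) memory in MiB for one GPU, as reported by nvidia-smi."""
    out = subprocess.run(
        [
            "nvidia-smi",
            f"--id={gpu_index}",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_memory_line(out.strip().splitlines()[0])


def is_growing(samples: list[int]) -> bool:
    """True if used memory never drops between samples and ends higher than it began."""
    if len(samples) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_drops and samples[-1] > samples[0]


def watch(num_samples: int = 10, interval_s: float = 5.0) -> list[int]:
    """Collect used-memory samples (MiB) while requests are sent from elsewhere."""
    samples = []
    for _ in range(num_samples):
        used, total = gpu_memory_mib()
        print(f"used {used} MiB of {total} MiB")
        samples.append(used)
        time.sleep(interval_s)
    return samples
```

Running `is_growing(watch())` while sending completion requests gives a rough indication: a steadily rising series suggests per-request growth, while a flat series near the 16 GB limit suggests the model simply fits with very little headroom.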

Hello there, thanks for opening your first issue. We welcome you to the FauxPilot community!