CUDA runtime error : out of memory
crimsonwisp opened this issue · comments
crimsonwisp commented
When I run the 6B model on a T4 (16 GB of memory),
the model seems to keep allocating memory as it handles requests,
so it crashes with the out-of-memory error after only a few requests.
Is this a bug?
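One way to check whether memory really grows with each request is to sample GPU usage between requests. The sketch below is only an illustration, not part of FauxPilot: it assumes `nvidia-smi` is on the PATH and that the model sits on GPU 0, and the helper names (`parse_used_mib`, `read_used_mib`, `grows_steadily`) are made up for this example.

```python
import subprocess
import time


def parse_used_mib(output: str) -> list[int]:
    """Parse nvidia-smi CSV output (one integer MiB value per GPU, no header)."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def read_used_mib() -> list[int]:
    """Ask nvidia-smi for the memory currently in use on each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_used_mib(result.stdout)


def grows_steadily(samples: list[int], min_step_mib: int = 1) -> bool:
    """True if every sample exceeds the previous one by at least min_step_mib."""
    if len(samples) < 2:
        return False
    return all(b - a >= min_step_mib for a, b in zip(samples, samples[1:]))


if __name__ == "__main__":
    samples = []
    for _ in range(10):
        samples.append(read_used_mib()[0])  # assumption: the model runs on GPU 0
        print(samples[-1], "MiB used")
        time.sleep(5)  # send one request to the server during each pause
    print("keeps growing:", grows_steadily(samples))
```

If usage climbs after every request and never levels off, that points to memory not being released; if it climbs and then plateaus just below 16 GB, the model may simply not fit with the current settings.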
github-actions commented
Hello there, thanks for opening your first issue. We welcome you to the FauxPilot community!