Why is CPU better?
linzehua opened this issue · comments
If CPU inference meets the requirements (e.g. latency) and we do not need batch inference, CPU is preferable because it is cheaper and carries less overhead. Otherwise, we would go with GPU inference, especially for batch inference.
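As a rough illustration of that rule of thumb, here is a minimal sketch. The function name `choose_device`, its parameters, and the numbers in the usage lines are hypothetical and not part of this project; the latency values would have to come from your own measurements.

```python
def choose_device(cpu_latency_ms: float,
                  latency_budget_ms: float,
                  batch_inference: bool) -> str:
    """Pick "cpu" or "gpu" following the rule of thumb above.

    All names here are illustrative assumptions, not project API.
    """
    # CPU is preferred only when it meets the latency requirement
    # and requests are served one at a time (no batching).
    if cpu_latency_ms <= latency_budget_ms and not batch_inference:
        return "cpu"
    # Otherwise fall back to GPU, which is the better fit for batch inference.
    return "gpu"


# Example usage with made-up numbers:
print(choose_device(cpu_latency_ms=40.0, latency_budget_ms=100.0, batch_inference=False))
print(choose_device(cpu_latency_ms=40.0, latency_budget_ms=100.0, batch_inference=True))
```

In practice you would measure `cpu_latency_ms` on your own hardware with your own model before deciding; cost is not modeled here beyond the assumption that CPU is the cheaper option.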