We recommend the following GPUs, depending on model size:
- For 1B to 3B models, use at least an NVIDIA T4, 10 Series, or 20 Series GPU.
- For 7B to 13B models, use an NVIDIA V100, A100, 30 Series, or 40 Series GPU.
Because latency requirements are not stringent in this scenario, we recommend using a model with at least 3B parameters to get the best response quality.
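The guidance above can be written as a small lookup, which may help when scripting a deployment check. This is only a sketch: the names `GPU_TIERS`, `MIN_PARAMS_B`, `recommend_gpus`, and `meets_quality_floor` are hypothetical and not part of any tool. Sizes that fall between the listed tiers (for example 5B) are not covered by the recommendations, so the helper returns `None` for them instead of guessing.

```python
# Hypothetical helper: the tiers mirror the recommendations in this section.
# Each entry is (min size, max size, suggested GPUs), sizes in billions of parameters.
GPU_TIERS = [
    (1.0, 3.0, ["NVIDIA T4", "10 Series", "20 Series"]),
    (7.0, 13.0, ["NVIDIA V100", "NVIDIA A100", "30 Series", "40 Series"]),
]

# Smallest model size recommended for this scenario (assumed constant name).
MIN_PARAMS_B = 3.0


def recommend_gpus(params_b):
    """Return the suggested GPU list for a model size, or None if no tier covers it."""
    for low, high, gpus in GPU_TIERS:
        if low <= params_b <= high:
            return gpus
    return None


def meets_quality_floor(params_b):
    """True if the model is at least as large as the recommended minimum."""
    return params_b >= MIN_PARAMS_B
```

For instance, `recommend_gpus(7)` returns the V100/A100/30 Series/40 Series list, while `meets_quality_floor(1)` is `False` because a 1B model is below the 3B recommendation.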