wayveai / Driving-with-LLMs

PyTorch implementation for the paper "Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving"

The inference generation is very slow

Alkaiddd opened this issue

The inference process is currently quite slow. Are there any methods available to accelerate it?
For the action task, it takes about 9 s per sample.

Hi Alkaiddd,
Thank you for your feedback! This model is not designed for real-time applications, and running inference with a 7B model does pose challenges, especially on less powerful GPUs. We have achieved inference times of 1-2 seconds per sample using batch inference on NVIDIA A100 GPUs. There is definitely room for improvement; for example, quantization can help if inference speed is important to you.