ModelTC / llmc

This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Home Page: https://arxiv.org/abs/2405.06001


How to use Tensorrt-LLM as backend

Worromots opened this issue · comments

commented

As described in the title.

You can set `save_fp` to `True` in the llmc config. Then you can use TensorRT-LLM's AMMO toolkit to convert the saved model into a naive quantized engine.
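A minimal sketch of what that config change might look like; the exact key names and section layout depend on the llmc config schema you are using, so treat the surrounding keys here as assumptions:

```yaml
# Hypothetical llmc config fragment: enable saving the fake-quantized
# model in a floating-point format that downstream tools can consume.
save:
    save_fp: True          # export the model for external conversion
    save_path: ./save      # assumed output directory key
```

With the exported files in place, the TensorRT-LLM/AMMO conversion scripts can then be pointed at that output directory to build the engine.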

commented

Thanks for your reply. I have set `save_fp` in llmc to `True`, and these are the files saved by llmc. How can I use TensorRT-LLM's AMMO to convert them into a naive quantized engine?
[image]

commented

A reminder: I still need your help.

The following process requires modifying some code to change the default settings in TensorRT-LLM. To help users use our tool more conveniently, we are preparing an official documentation page about this workflow. Please wait patiently for our updates.