How to use TensorRT-LLM as a backend
Worromots opened this issue
TSL commented
As described in the title.
Yang Yong commented
Set save_fp to True in your llmc config. You can then use the AMMO tooling in trt-llm to build a naive quantized engine from the saved model.
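For anyone unsure what that first step looks like, here is a minimal Python sketch of switching the flag on in a config that has already been loaded into a dict. Only the save_fp name comes from this thread. The "save" section name, the "save_path" key, and the enable_save_fp helper are assumptions made for illustration, so check them against the config files shipped with your llmc version.

```python
import copy


def enable_save_fp(config: dict, save_path: str) -> dict:
    """Return a copy of an llmc-style config dict with save_fp turned on.

    Hypothetical helper: the "save" section and "save_path" key are
    assumed names, not confirmed against the llmc source.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    save_section = updated.setdefault("save", {})  # assumed section name
    save_section["save_fp"] = True  # flag named in this thread
    save_section["save_path"] = save_path  # assumed key for the output dir
    return updated


# Example input standing in for a parsed llmc config file.
example_config = {"save": {"save_fp": False}}
new_config = enable_save_fp(example_config, "./llmc_export")
print(new_config)
```

The helper returns a modified copy rather than editing in place, so the original settings stay available for comparison. The directory written by llmc would then be the input to the trt-llm conversion step, which is not shown here because it depends on changes to TensorRT-LLM defaults.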
TSL commented
Following up on this. I need your help.
Harahan commented
The remaining steps require modifying some code to change the default settings in TensorRT-LLM. To make the tool easier to use, we are working on an official documentation page for it. Please be patient and wait for our update.