TP=2, Loss of accuracy
coderchem opened this issue
coderchem commented
Hello, I ran the 7B LLaMA model with TP=2 (multi-GPU tensor parallelism) and found that accuracy dropped by about 5% compared with the single-GPU result. As far as I know, TP=2 should not change accuracy. Why does this happen?
hurun commented
Hi, can you post the steps to reproduce?
byshiue_NV commented
FasterTransformer does not support LLaMA officially, and FasterTransformer development has transitioned to TensorRT-LLM. TensorRT-LLM supports LLaMA; please give it a try.
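As a side note on the original question: tensor parallelism is not expected to be bit-identical to single-GPU execution, because splitting a matmul across GPUs changes the floating-point reduction order, and float addition is not associative. That drift is normally tiny (a 5% accuracy drop more likely indicates a real bug, e.g. in weight sharding), but the effect itself can be shown with a minimal NumPy sketch that simulates TP=2 by splitting the inner dimension of a matmul into two partial products and summing them, the way an all-reduce would:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(10000).astype(np.float32)          # activation vector
W = rng.standard_normal((10000, 4)).astype(np.float32)     # weight matrix

# Single-GPU reference: one matmul over the full inner dimension.
full = x @ W

# Simulated TP=2: each "GPU" holds half of the inner dimension,
# computes a partial product, and the results are summed (all-reduce).
part = x[:5000] @ W[:5000] + x[5000:] @ W[5000:]

# The two results agree to within float32 rounding, but the changed
# summation order means they need not match bit for bit.
print(np.max(np.abs(full - part)))
```

The discrepancy printed here is on the order of float32 rounding error; anything on the scale of a 5% metric change cannot be explained by reduction order alone.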