Anomalous T5 results using GPU inference on a 4090 graphics card
taishan1994 opened this issue · comments
Thank you very much for your work. I'm using CTranslate2 to accelerate inference with https://huggingface.co/Maciel/T5Corrector-base-v2. When running inference on the CPU the output is normal, but after switching to the GPU the output is always: `Response Text: {"translated_text":"..."}`. Where is the problem?
Wish I could help but it's all in Chinese...what exactly are you trying to do?
This is the code I tested:
```shell
ct2-transformers-converter --model T5Corrector-base-v2 --output_dir T5Corrector-base-v2-ct2 --force --quantization float16
```
```python
import ctranslate2
from transformers import AutoTokenizer

# Tokenizer from the original Hugging Face model
tokenizer = AutoTokenizer.from_pretrained("Maciel/T5Corrector-base-v2")

# CPU inference works fine:
# translator = ctranslate2.Translator("T5Corrector-base-v2-ct2", device="cpu")
# GPU inference produces the anomalous output:
translator = ctranslate2.Translator("T5Corrector-base-v2-ct2", device="cuda", device_index=0)

input_text = ""
input_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(input_text))
results = translator.translate_batch([input_tokens])
output_tokens = results[0].hypotheses[0]
output_text = tokenizer.decode(tokenizer.convert_tokens_to_ids(output_tokens))
```
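One thing worth ruling out: the model was converted with `--quantization float16`, and some T5 checkpoints produce garbage in half precision because intermediate activations overflow the float16 range, while the same weights run correctly in float32 on CPU. A quick check, assuming nothing beyond the commands already shown, is to reconvert without quantization and compare the GPU output:

```shell
# Reconvert without float16 quantization to test whether
# half-precision overflow is the cause of the anomalous output.
ct2-transformers-converter --model T5Corrector-base-v2 \
    --output_dir T5Corrector-base-v2-ct2-fp32 --force
```

Alternatively, the existing converted model can be loaded with `ctranslate2.Translator(..., device="cuda", compute_type="float32")` to force float32 computation on the GPU without reconverting. If either variant produces correct output, the problem is the float16 quantization rather than the GPU path itself.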
Sorry, thought I might be able to help, but I'm not familiar with that model.