OpenNMT / CTranslate2

Fast inference engine for Transformer models

Home Page: https://opennmt.net/CTranslate2


Anomalous T5 results using GPU inference on a 4090 graphics card

taishan1994 opened this issue · comments

Thank you very much for your work. I'm using CTranslate2 to accelerate inference with https://huggingface.co/Maciel/T5Corrector-base-v2. When I run inference on the CPU, the output is normal, but after switching to the GPU, the output is always: Response Text: {"translated_text":"..."}. What could the problem be?

I wish I could help, but it's all in Chinese... what exactly are you trying to do?

This is the code I tested:

ct2-transformers-converter --model T5Corrector-base-v2 --output_dir T5Corrector-base-v2-ct2  --force --quantization float16

import ctranslate2
from transformers import AutoTokenizer

# Tokenizer comes from the original Hugging Face model, not the converted directory
tokenizer = AutoTokenizer.from_pretrained("Maciel/T5Corrector-base-v2")

# translator = ctranslate2.Translator("T5Corrector-base-v2-ct2", device="cpu")
translator = ctranslate2.Translator("T5Corrector-base-v2-ct2", device="cuda", device_index=0)

input_text = ""  # text to correct
input_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(input_text))
results = translator.translate_batch([input_tokens])

output_tokens = results[0].hypotheses[0]
output_text = tokenizer.decode(tokenizer.convert_tokens_to_ids(output_tokens))
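Not an answer from the thread, but since the model above was converted with --quantization float16 and only the GPU path misbehaves, one hedged first check is to rule out float16 as the cause (some T5 checkpoints are sensitive to reduced precision). Assuming the same model and directory names as above, either re-convert without quantization, or keep the converted model and override the compute type at load time:

```
# Hypothesis only (not confirmed in this thread): float16 may be the culprit.
# Option 1: re-convert without quantization and compare the GPU output.
ct2-transformers-converter --model T5Corrector-base-v2 \
    --output_dir T5Corrector-base-v2-ct2-fp32 --force

# Option 2: keep the existing float16 model on disk, but force float32
# computation when loading it, via the compute_type argument:
#
#   translator = ctranslate2.Translator("T5Corrector-base-v2-ct2",
#                                       device="cuda", device_index=0,
#                                       compute_type="float32")
```

If the float32 run produces correct output on the GPU, the issue is precision-related rather than a bug in the translation loop.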

Sorry, I thought I might be able to help, but I'm not familiar with that model.