OpenNMT / CTranslate2

Fast inference engine for Transformer models

Home Page: https://opennmt.net/CTranslate2


Anomalous T5 results using GPU inference on a 4090 graphics card

taishan1994 opened this issue · comments

Thank you very much for your work. I'm using CTranslate2 to accelerate inference with https://huggingface.co/Maciel/T5Corrector-base-v2. When I run inference on the CPU, the output is normal, but after switching to the GPU, the output is always: Response Text: {"translated_text":"..."}. What could the problem be?

I wish I could help, but it's all in Chinese... what exactly are you trying to do?

This is the code I tested:

ct2-transformers-converter --model T5Corrector-base-v2 --output_dir T5Corrector-base-v2-ct2  --force --quantization float16

import ctranslate2
from transformers import AutoTokenizer

# Tokenizer comes from the original Hugging Face model, not the converted directory
tokenizer = AutoTokenizer.from_pretrained("Maciel/T5Corrector-base-v2")

# translator = ctranslate2.Translator("T5Corrector-base-v2-ct2", device="cpu")
translator = ctranslate2.Translator("T5Corrector-base-v2-ct2", device="cuda", device_index=0)

input_text = ""  # text to correct
input_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(input_text))
results = translator.translate_batch([input_tokens])

output_tokens = results[0].hypotheses[0]
output_text = tokenizer.decode(tokenizer.convert_tokens_to_ids(output_tokens))
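Not an answer from the thread, but since the model above was converted with --quantization float16 and only the GPU path misbehaves, one hedged first check is to rule out float16 as the cause (some T5 checkpoints are sensitive to reduced precision). Assuming the same model and directory names as above, either re-convert without quantization, or keep the converted model and override the compute type at load time:

```
# Hypothesis only (not confirmed in this thread): float16 may be the culprit.
# Option 1: re-convert without quantization and compare the GPU output.
ct2-transformers-converter --model T5Corrector-base-v2 \
    --output_dir T5Corrector-base-v2-ct2-fp32 --force

# Option 2: keep the existing float16 model on disk, but force float32
# computation when loading it, via the compute_type argument:
#
#   translator = ctranslate2.Translator("T5Corrector-base-v2-ct2",
#                                       device="cuda", device_index=0,
#                                       compute_type="float32")
```

If the float32 run produces correct output on the GPU, the issue is precision-related rather than a bug in the translation loop.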

Sorry, I thought I might be able to help, but I'm not familiar with that model.