Translation of pre-split and pre-tokenized sentences
BLKSerene opened this issue
Ye Lei (叶磊) commented
Hi, the docs say Argos Translate uses SentencePiece (and maybe Sacremoses?) for tokenization and Stanza for sentence boundary detection. I'm wondering whether it is possible to translate pre-split and pre-tokenized sentences (a list of lists of tokens). In that case I could drop many of Argos Translate's dependencies, since the strict version pinning of dependencies causes quite a few problems (cf. #362, #395).
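For illustration, something like the sketch below is what I have in mind: feeding pre-tokenized sentences straight to the underlying CTranslate2 model. This is only an assumption on my part about how an extracted .argosmodel package is laid out (the `model/` directory path is hypothetical), not Argos Translate's actual API:

```python
# Hypothetical sketch: translate pre-tokenized sentences by calling the
# underlying CTranslate2 model directly, bypassing Stanza and SentencePiece.
# The model path is an assumption about the extracted package layout.
import ctranslate2

translator = ctranslate2.Translator("path/to/extracted_package/model")

# Each sentence is already split and tokenized into subword tokens by the
# caller, so no sentence boundary detection or tokenization happens here.
pre_tokenized_sentences = [
    ["▁Hello", "▁world", "."],
    ["▁How", "▁are", "▁you", "?"],
]

results = translator.translate_batch(pre_tokenized_sentences)
for result in results:
    # Hypotheses come back as token lists; detokenization is left to the caller.
    print(result.hypotheses[0])
```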