explosion / tokenizations

Robust and Fast tokenizations alignment library for Rust and Python https://tamuhey.github.io/tokenizations/

get_original_spans. A simple example with an error

LaurentBie opened this issue

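from tokenizations import get_original_spans

text = 'A ´'
tokens = ['A', '´']
spans = get_original_spans(tokens, text)

# Print each token next to the slice of the original text it was mapped to
for token, span in zip(tokens, spans):
    print("Token:", token, '->', text[span[0]: span[1]])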

Hi, thanks for reporting this bug!
It is already fixed in textspan.get_original_spans, and tokenizations.get_original_spans will be deprecated in a future release. Please use textspan.get_original_spans instead.

tokenizations.get_original_spans is deprecated as of 0.7.1, which I've just published.
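
For anyone migrating, here is a rough sketch of the same example on top of textspan. It assumes the textspan Python package is installed, and that textspan.get_original_spans(tokens, text) returns a list of (start, end) spans per token rather than a single span, so the printing loop has to change a little. The helper span_texts and the wrapper demo are hypothetical names used only for illustration; they are not part of either library.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]


def span_texts(text: str, token_spans: Sequence[Sequence[Span]]) -> List[List[str]]:
    """For each token, return the substrings of `text` covered by its spans.

    Hypothetical helper: assumes every token maps to zero or more
    (start, end) spans, which is the shape textspan is assumed to return.
    """
    return [[text[start:end] for start, end in spans] for spans in token_spans]


def demo() -> None:
    # Imported here so the helper above works without textspan installed.
    import textspan  # third-party; assumed to be installed

    text = "A ´"
    tokens = ["A", "´"]
    # Assumption: the result holds one list of spans per token.
    spans = textspan.get_original_spans(tokens, text)
    for token, pieces in zip(tokens, span_texts(text, spans)):
        print("Token:", token, "->", pieces)
```

Calling demo() runs the example from this issue against textspan. A token that could not be aligned would show up with an empty list instead of raising an error, assuming the return shape described above.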