blester125 / iobes

Tool for parsing and converting various span encoding schemes.

Question: why are both start/end and tokens needed

johann-petrak opened this issue

I just found this package and it looks very useful, but I could not find much documentation.

So while reading the code, I was wondering: when creating a Span, why is it necessary to specify both start/end and the tuple of token indices via tokens? Doesn't one imply the other, i.e. tokens = tuple(range(start, end)) or start=tokens[0] and end=tokens[-1]?

I knew someone who used "spans with gaps" to represent complex things that had multiple parts. The tokens field was basically just a way to support that (very rare lol) use case.

In general you are right, you don't need both.
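
To make the relationship concrete, here is a minimal sketch of how the two representations relate. It does not use the iobes Span class: SketchSpan, label, contiguous, from_tokens, and has_gaps are all hypothetical names made up for illustration, and the sketch assumes end is exclusive, which may not match the library's convention.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class SketchSpan:
    """Hypothetical stand-in for a span type; NOT the iobes Span class."""

    label: str
    start: int
    end: int  # assumption: exclusive, so a contiguous span covers range(start, end)
    tokens: Tuple[int, ...]

    @classmethod
    def contiguous(cls, label: str, start: int, end: int) -> "SketchSpan":
        # For a span without gaps, tokens is fully implied by start/end.
        return cls(label, start, end, tuple(range(start, end)))

    @classmethod
    def from_tokens(cls, label: str, tokens: Iterable[int]) -> "SketchSpan":
        # start/end can always be derived from tokens, gaps or not.
        ordered = tuple(sorted(tokens))
        return cls(label, ordered[0], ordered[-1] + 1, ordered)

    @property
    def has_gaps(self) -> bool:
        # True when tokens carries information that start/end alone cannot.
        return self.tokens != tuple(range(self.start, self.end))


# A contiguous span: either representation is enough on its own.
plain = SketchSpan.contiguous("LOC", 2, 5)

# A "span with gaps": tokens 4 and 5 are skipped, so only tokens records that.
gappy = SketchSpan.from_tokens("LOC", (2, 3, 6))

print(plain.tokens, plain.has_gaps)
print(gappy.start, gappy.end, gappy.has_gaps)
```

Under these assumptions, start/end can be recovered from tokens in every case, while tokens can be recovered from start/end only when the span has no gaps. That asymmetry is the one situation where storing both is not redundant.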