REPP versions

Question

arademaker opened this issue 2 years ago

Alexandre Rademaker commented 2 years ago

What is the relation between this code and the code from Woodley?

https://github.com/delph-in/homebrew-delphin/blob/HEAD/Formula/repp.rb#L4

I know from https://github.com/delph-in/docs/wiki/ReppTop that the code from Woodley and the @goodmami implementation at https://pydelphin.readthedocs.io/en/latest/api/delphin.repp.html are alternative implementations. Are all of these 100% compatible?

Michael Wayne Goodman · Answer 1 · Sat Aug 13 2022 10:22:54 GMT+0800 (China Standard Time)

Woodley's version is what's used in ACE, and I believe it predates this implementation a little. This repo is the code used for PET and for the standalone repp command (which is currently used by NLTK's nltk.tokenize.repp module). Two other implementations are PyDelphin's and the LKB's (which probably should be listed in the ReppTop wiki's "Implementations" section, even though it's mentioned elsewhere in the doc).

They are mostly compatible. The main differences are in masking support and characterization (the start/stop indices of tokens). This repo and Woodley's repp-0.2.2 release do not include masking, but Woodley has an unreleased version of his implementation with masking support that is used in recent versions of ACE. The LKB and PyDelphin both have masking support. And while PyDelphin follows this repo's characterization behavior exactly, Woodley's code, last I checked, outputs different characterization in some cases. I don't recall what the LKB does for characterization.
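
To make the characterization point concrete, here is a minimal sketch of how one might check two implementations against each other once their outputs are in hand. It assumes each implementation's tokens have already been converted to (start, end, form) tuples; that layout, the helper name characterization_diffs, and the sample data are all invented for illustration and are not the output of any actual REPP implementation.

```python
from typing import List, Tuple

# A token as (start, end, form): character offsets into the original
# input plus the surface form. This tuple layout is an assumption made
# for illustration, not the output format of any REPP implementation.
Token = Tuple[int, int, str]


def characterization_diffs(a: List[Token], b: List[Token]) -> List[Tuple[Token, Token]]:
    """Return pairs of tokens whose forms match but whose spans differ.

    Both lists must contain the same forms in the same order; otherwise
    the difference is in tokenization, not characterization, and a
    ValueError is raised.
    """
    if len(a) != len(b) or any(x[2] != y[2] for x, y in zip(a, b)):
        raise ValueError("token sequences differ; compare tokenization first")
    return [(x, y) for x, y in zip(a, b) if x[:2] != y[:2]]


# Invented example data for the input "Don't panic." (hypothetical,
# not real output from ACE, PET, PyDelphin, or the LKB).
impl_a = [(0, 2, "Do"), (2, 5, "n't"), (6, 11, "panic"), (11, 12, ".")]
impl_b = [(0, 2, "Do"), (2, 5, "n't"), (6, 12, "panic"), (11, 12, ".")]

diffs = characterization_diffs(impl_a, impl_b)
for left, right in diffs:
    print(f"{left[2]!r}: {left[:2]} vs {right[:2]}")
```

Running a check like this over a test corpus would show where two implementations agree on the tokens but disagree on the spans, which is the kind of difference described above.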