rafelafrance / traiter

Extract information from natural history annotations

Use lemmatization and parts of speech tagging for token matching

rafelafrance opened this issue

commented

This will require separate regex search strategies for tokenization and parsing, as well as using token byte strings for the searching. We're going to be slinging bytes in and out of the token matches, so this will become compute-intensive. Cython?
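A minimal sketch of the two-stage idea, not traiter's actual implementation: one regex tokenizes the raw text, each token is reduced to a single byte standing for its part of speech, and a second bytes regex parses the resulting byte string. Everything named here (`LEXICON`, `Token`, `tokenize`, `parse`, the one-letter tag codes, and the `NR?V` pattern) is hypothetical and chosen only for illustration; a real version would get lemmas and tags from a proper tagger rather than a hand-written table.

```python
import re
from dataclasses import dataclass

# Hypothetical lexicon: surface form -> (lemma, one-byte part-of-speech code).
# N = noun, V = verb, R = adverb/negation. A real tagger would replace this.
LEXICON = {
    "testes": ("testis", b"N"),
    "testis": ("testis", b"N"),
    "descended": ("descend", b"V"),
    "enlarged": ("enlarge", b"V"),
    "not": ("not", b"R"),
}


@dataclass
class Token:
    text: str    # surface form as it appears in the annotation
    lemma: str   # normalized form used for matching
    code: bytes  # exactly one byte, so byte offsets equal token indexes
    start: int   # character offsets into the original text
    end: int


# Stage 1: tokenization regex, run over the raw text.
TOKENIZER = re.compile(r"\d+(?:\.\d+)?|[A-Za-z]+|\S")

# Stage 2: parsing regex, run over the byte string of token codes.
# Matches a noun, an optional negation, then a verb.
PARSER = re.compile(rb"NR?V")


def tokenize(text):
    """Split text into tokens and attach a lemma and a one-byte code to each."""
    tokens = []
    for match in TOKENIZER.finditer(text):
        word = match.group().lower()
        if word in LEXICON:
            lemma, code = LEXICON[word]
        elif word[0].isdigit():
            lemma, code = word, b"D"  # D = number
        else:
            lemma, code = word, b"X"  # X = anything unrecognized
        tokens.append(Token(match.group(), lemma, code, match.start(), match.end()))
    return tokens


def parse(text):
    """Return the token runs whose codes match the parsing regex."""
    tokens = tokenize(text)
    codes = b"".join(token.code for token in tokens)
    # One byte per token, so a match span slices the token list directly.
    return [tokens[m.start():m.end()] for m in PARSER.finditer(codes)]


if __name__ == "__main__":
    for run in parse("Testes not descended, 12 mm"):
        print([token.lemma for token in run])
```

Because every token contributes exactly one byte, translating a match back to tokens is a plain list slice. The cost is the repeated building and scanning of byte strings for every annotation, which is the overhead the comment above worries about.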

commented

Done.