NIHOPA / NLPre

Python library for Natural Language Preprocessing (NLPre)

Standard Input Variables For Classes

HarryBaker opened this issue

Most of the classes take 'doc' as their input variable. However, a couple of them take 'text' or 'org_doc' instead. Since they all take the same kind of input (a document that is a string), I think we should standardize the input variable to reduce confusion.

Sure, that's an easy one to fix. Let's use 'text' instead of 'doc', since the input could be anything from a few characters to a long paragraph.

OK. Do you want to use a different variable name for unidecoder, to indicate that its input is not a plain string like the other classes'?

Sure, call it unicode_text. It may be useful to have it raise an exception if a standard string is passed in. Note that this can only happen in Python 2, since all strings are unicode in Python 3.
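
For reference, here is a minimal sketch of how the naming agreed on above might look. The class names `ExampleLowercaser` and `ExampleUnidecoder`, the choice of `TypeError`, and the standard-library transliteration are all assumptions made for illustration; none of this is NLPre's actual code.

```python
import sys
import unicodedata

# The unicode string type is `unicode` on Python 2 and `str` on Python 3.
# The name `unicode` is only evaluated on Python 2, so this line is safe on both.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


class ExampleLowercaser(object):
    """Hypothetical class showing the standardized 'text' argument."""

    def __call__(self, text):
        return text.lower()


class ExampleUnidecoder(object):
    """Hypothetical stand-in for unidecoder, taking 'unicode_text'."""

    def __call__(self, unicode_text):
        # Reject anything that is not a unicode string (for example a
        # Python 2 byte string) rather than silently mangling it.
        if not isinstance(unicode_text, text_type):
            raise TypeError(
                "unicode_text must be a unicode string, got %s"
                % type(unicode_text).__name__
            )
        # Illustrative transliteration only: decompose accented characters,
        # then drop whatever cannot be represented in ASCII.
        decomposed = unicodedata.normalize("NFKD", unicode_text)
        return decomposed.encode("ascii", "ignore").decode("ascii")
```

With this sketch, `ExampleUnidecoder()(u"café")` returns `"cafe"`, and passing a byte string raises `TypeError`. A `TypeError` seemed like the natural fit for a wrong input type, but any exception class would serve the same purpose.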