Fork of Lexical Computing's unitok tokeniser that works with Python 3 and can be installed easily as a pip package.
Licensed, as per the original, under the MPL 2.0. See LICENSE for the full text.
Many thanks to Lexical Computing, Jan Pomikalek, Jan Michelfeit and Vit Suchomel for writing one of the best tokenisers out there.