Add regex and special character support to tokeniser
bclarkson-code opened this issue
bclarkson-code commented
The tokeniser should be extended to support special characters (e.g. pad tokens) and regex-based parsing (e.g. splitting text into words before training, so that merges never cross word boundaries).
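
One possible shape for this, as a minimal sketch rather than a proposal for the actual API: split out the special tokens first so they stay atomic, then apply a regex to the remaining text and count merge candidates only inside each chunk. Every name here (`SPECIAL_TOKENS`, `WORD_PATTERN`, `split_special`, `pre_tokenise`, `count_pairs`) is hypothetical, and the split pattern is a deliberately simple assumption, not the pattern used by any particular tokeniser.

```python
import re
from collections import Counter

# Hypothetical special tokens; the real set would be configurable.
SPECIAL_TOKENS = ["<pad>", "<eos>"]

# Assumed split pattern: a word with an optional leading space,
# a run of punctuation with an optional leading space, or whitespace.
WORD_PATTERN = re.compile(r" ?\w+| ?[^\w\s]+|\s+")


def split_special(text, special_tokens):
    """Split text so that each special token becomes its own piece."""
    if not special_tokens:
        return [text]
    # Longest first, so a token that is a prefix of another cannot shadow it.
    ordered = sorted(special_tokens, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(tok) for tok in ordered) + ")"
    return [piece for piece in re.split(pattern, text) if piece]


def pre_tokenise(text, special_tokens=SPECIAL_TOKENS):
    """Return chunks: special tokens kept whole, other text split by regex."""
    chunks = []
    for piece in split_special(text, special_tokens):
        if piece in special_tokens:
            chunks.append(piece)
        else:
            chunks.extend(WORD_PATTERN.findall(piece))
    return chunks


def count_pairs(chunks, special_tokens=SPECIAL_TOKENS):
    """Count adjacent byte pairs within each chunk, never across chunks."""
    counts = Counter()
    for chunk in chunks:
        if chunk in special_tokens:
            continue  # special tokens are never merged or broken up
        symbols = list(chunk.encode("utf-8"))
        counts.update(zip(symbols, symbols[1:]))
    return counts
```

With this sketch, `pre_tokenise("hello world<pad><pad>")` yields the chunks `"hello"`, `" world"`, `"<pad>"`, `"<pad>"`. Because pairs are counted per chunk, the last byte of one word and the first byte of the next are never seen as a merge candidate.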