bclarkson-code / Tricycle

Deep learning framework written completely from scratch in Python + NumPy

Add regex and special character support to tokeniser

bclarkson-code opened this issue

The tokeniser should be extended to support special tokens (e.g. pad tokens) and regex-based pre-splitting of the input (e.g. to avoid merging byte pairs across word boundaries).
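
A minimal sketch of what this could look like, assuming a byte-level BPE tokeniser. Every name here (`SPECIAL_TOKENS`, `WORD_PATTERN`, `split_special`, `pre_tokenise`, `count_pairs`), the pad token string and its id are hypothetical and for illustration only; none of this is Tricycle's existing API. The idea is to cut the text at special tokens first, then split the remaining text with a regex, and only count byte pairs inside each chunk so that no merge can span two words or touch a special token.

```python
import re
from collections import Counter

# Hypothetical special tokens; ids start above the 256 raw byte values.
SPECIAL_TOKENS = {"<|pad|>": 256}

# Assumed split pattern: a word, number or punctuation run (each with an
# optional leading space), or a run of whitespace. Every character falls
# into exactly one alternative, so the split loses no text.
WORD_PATTERN = re.compile(r" ?[A-Za-z]+| ?[0-9]+| ?[^\sA-Za-z0-9]+|\s+")


def split_special(text, special_tokens):
    """Split text into (chunk, is_special) pairs, keeping special tokens whole."""
    if not special_tokens:
        return [(text, False)]
    # Longest first, so a token that is a prefix of another cannot shadow it.
    ordered = sorted(special_tokens, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(tok) for tok in ordered) + ")"
    # A capturing group makes re.split keep the matched special tokens.
    return [(p, p in special_tokens) for p in re.split(pattern, text) if p]


def pre_tokenise(text, special_tokens=SPECIAL_TOKENS):
    """Return one list of ids per word-level chunk or special token."""
    chunks = []
    for part, is_special in split_special(text, special_tokens):
        if is_special:
            chunks.append([special_tokens[part]])
        else:
            for word in WORD_PATTERN.findall(part):
                chunks.append(list(word.encode("utf-8")))
    return chunks


def count_pairs(chunks):
    """Count adjacent id pairs within each chunk, never across chunks."""
    counts = Counter()
    for chunk in chunks:
        counts.update(zip(chunk, chunk[1:]))
    return counts


chunks = pre_tokenise("hi there<|pad|>")
pairs = count_pairs(chunks)
```

Because pairs are counted per chunk, the pair formed by the last byte of `hi` and the space before `there` never appears, and the pad token stays a single id that is never a merge candidate. A production pattern would likely need Unicode-aware character classes, which the standard library `re` module does not provide via `\p{...}`.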