Get .terms() but keep hyphenated strings (similar to .hyphenated() )
PuneetKohli opened this issue · comments
Puneet Kohli commented
Is there a way to achieve this?
spencer kelly commented
hey Puneet, good question:
Little weird, but you could do .splitAfter('!@hasHyphen'), like this:
https://runkit.com/spencermountain/659822ebdfb7e500085838fd
Alternatively, you could shim-in a custom tokenizer, like:
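import nlp from 'compromise'

// override the term splitter so it breaks on spaces only, not on hyphens
nlp.world().methods.one.tokenize.splitTerms = function (str) {
  return str.split(/ /)
}
nlp('one two-three four five').debug()
// one, two-three, four, five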
that one is obviously simplified, but let me know if you'd like some more help.
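For anyone landing here later, a slightly less simplified splitter might look like the sketch below. It is plain JavaScript with no dependency on compromise. The name splitOnWhitespace is hypothetical, and treating any run of whitespace (tabs, repeated spaces) as a single separator is an assumption about what you want, not something the library requires.

```javascript
// Hypothetical helper (not part of compromise): split on runs of
// whitespace only, so hyphenated words stay together as one term.
function splitOnWhitespace(str) {
  return str
    .trim()                              // drop leading/trailing whitespace
    .split(/\s+/)                        // break on one or more whitespace chars
    .filter((term) => term.length > 0)   // an empty input yields [] rather than ['']
}

const terms = splitOnWhitespace('  one two-three   four\tfive ')
console.log(terms)
// [ 'one', 'two-three', 'four', 'five' ]
```

In principle a function like this could be assigned to splitTerms in place of the one-liner above, but that is untested here, and the tokenizer may expect more from that hook than a plain array of strings.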
cheers