juliasilge / tidytext

Text mining using tidy tools :sparkles::page_facing_up::sparkles:

Home Page: https://juliasilge.github.io/tidytext/

Replacement of `token = "tweets"`

NajaMLindelof opened this issue

Hello!

Firstly, thanks a lot for creating and maintaining this package! It is really great to work with.

I have been using the `token = "tweets"` Twitter tokenizer quite a lot. Do you have a recommended replacement for its function (mainly retaining hashtags, usernames, URLs, etc.) now that it has been removed in tidytext 0.4.0 (#227)?

Thanks a lot!

I would go back to using a regular expression with token = "regex", which is what we did before the token = "tweets" option was available. Take a look at this change in our book, where we replaced a regex with token = "tweets", and make the opposite change where you go back to using the regular expression.
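
For readers landing here, the following is a minimal sketch of what the `token = "regex"` approach might look like. The sample data and the `unnest_reg` pattern are illustrative assumptions, not the exact code from the linked book change. With `token = "regex"`, `pattern` describes the separators to split on, so this pattern treats everything except letters, digits, underscores, `#`, `@`, and word-internal apostrophes as a separator.

```r
library(dplyr)
library(tidytext)

# Hypothetical sample data, for illustration only
tweets <- tibble(
  id = 1:2,
  text = c(
    "Loving #rstats with @juliasilge's tidytext!",
    "Text mining isn't hard #NLP"
  )
)

# Separator pattern (an assumption, adjust to your data):
# split on any character that is not a letter, digit, underscore, #, @,
# or apostrophe, and also on an apostrophe that is not followed by one
# of those characters (e.g. a closing quote mark).
unnest_reg <- "([^A-Za-z_\\d#@']|'(?![A-Za-z_\\d#@]))"

tidy_tweets <- tweets %>%
  unnest_tokens(word, text, token = "regex", pattern = unnest_reg)

tidy_tweets
```

This keeps hashtags and usernames as single tokens (lowercased by default; pass `to_lower = FALSE` to preserve case). Note that this particular pattern breaks URLs apart at `:`, `/`, and `.`, so URLs would need to be removed beforehand or the pattern adjusted if they should be retained.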

Brilliant, thanks!