MaartenGr / KeyBERT

Minimal keyword extraction with BERT

Home Page: https://MaartenGr.github.io/KeyBERT/

Keywords are all lowercase

secsilm opened this issue

Hi, great project, first of all.

I would prefer to keep the original form instead of all lowercase. How can I do that? Thanks.

That is a result of the underlying tokenizer. You can find more about that here.

OK, I found it. It's a CountVectorizer config. Thanks.
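
For anyone landing here later, a minimal sketch of the fix, assuming the `lowercase` option of scikit-learn's `CountVectorizer` and the `vectorizer` argument of KeyBERT's `extract_keywords`. The sample sentence and the helper name `extract_cased_keywords` are made up for illustration, and the KeyBERT call is kept inside a function so the vectorizer part runs on its own without downloading a model.

```python
from sklearn.feature_extraction.text import CountVectorizer

doc = "Supervised learning with BERT and PyTorch"

# Default behaviour: lowercase=True, so every candidate token is lowercased.
default_vocab = sorted(CountVectorizer().fit([doc]).vocabulary_)

# lowercase=False keeps the original casing of each token.
cased_vocab = sorted(CountVectorizer(lowercase=False).fit([doc]).vocabulary_)

print(default_vocab)
print(cased_vocab)


def extract_cased_keywords(text, top_n=5):
    """Hypothetical helper: hand a case-preserving vectorizer to KeyBERT."""
    # Imported lazily so the vectorizer demo above works without keybert installed.
    from keybert import KeyBERT

    vectorizer = CountVectorizer(lowercase=False)
    kw_model = KeyBERT()
    # When a vectorizer is passed, it is used to build the candidate keywords.
    return kw_model.extract_keywords(text, vectorizer=vectorizer, top_n=top_n)
```

One caveat: scikit-learn's built-in English stop word list is lowercase, so if you also set `stop_words="english"` together with `lowercase=False`, capitalized stop words such as "The" may no longer be filtered out.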