MaartenGr / KeyBERT

Minimal keyword extraction with BERT

Home Page: https://MaartenGr.github.io/KeyBERT/

Suggesting new terms for my vocabulary

hekl opened this issue

Hello Maarten,
I see that KeyBERT can also work with a vocabulary. I would like to find the difference between my vocabulary and the terms that KeyBERT can suggest. Basically, I want to trace which of my vocabulary terms fit the documents and which new terms I might need to add. Is that possible?
Henk

The terms that KeyBERT can give back are the words that the underlying tokenizer produces for each document. In other words, you could apply that tokenizer outside of KeyBERT to see which candidate words it can return.
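
A minimal sketch of that comparison, not KeyBERT's own code: it tokenizes the documents outside of KeyBERT and uses set operations to split the result into vocabulary terms that occur, vocabulary terms that do not, and words that are not in the vocabulary yet. The regex tokenizer is an assumption standing in for the real one (it lowercases and keeps words of two or more characters, similar to the default token pattern of scikit-learn's `CountVectorizer`), and `candidate_words`, `compare_vocabulary`, `docs`, and `my_vocabulary` are hypothetical names used only for illustration.

```python
import re

# Assumption: a simple stand-in tokenizer that keeps words of two or more
# characters. Swap in the tokenizer you actually use with KeyBERT.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def candidate_words(doc):
    """Return the set of lowercased words the stand-in tokenizer finds in doc."""
    return set(TOKEN_PATTERN.findall(doc.lower()))


def compare_vocabulary(docs, vocabulary):
    """Compare a user vocabulary with the candidate words found in docs."""
    candidates = set()
    for doc in docs:
        candidates |= candidate_words(doc)
    vocab = {term.lower() for term in vocabulary}
    return {
        "matched": sorted(vocab & candidates),  # vocabulary terms present in the documents
        "unused": sorted(vocab - candidates),   # vocabulary terms that never occur
        "new": sorted(candidates - vocab),      # words not yet in the vocabulary
    }


# Hypothetical example data.
docs = [
    "Supervised learning maps inputs to outputs.",
    "Clustering groups similar inputs.",
]
my_vocabulary = ["learning", "clustering", "regression"]

result = compare_vocabulary(docs, my_vocabulary)
print(result["matched"])
print(result["unused"])
print(result["new"])
```

This only handles single-word terms; multi-word vocabulary entries would need an n-gram tokenizer. The "new" list contains every tokenized word, including stop words, so one option for narrowing it down is to run KeyBERT's `extract_keywords` without a vocabulary and apply the same set difference to the keywords it returns.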