MaartenGr / KeyBERT

Minimal keyword extraction with BERT

Home Page: https://MaartenGr.github.io/KeyBERT/

Is KeyBERT going to support LLaMA?

thtang opened this issue · comments

Hi, I received an error once I changed the model to decapoda-research/llama-7b-hf. Does this error come from sentence-transformers?

ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as
pad_token (tokenizer.pad_token = tokenizer.eos_token e.g.) or add a new pad token via
tokenizer.add_special_tokens({'pad_token': '[PAD]'}).
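For reference, the error message itself points at the usual workaround: the LLaMA tokenizer ships without a padding token, so one has to be assigned before batched encoding can pad inputs. A minimal sketch, assuming the tokenizer for the checkpoint above is loaded directly through `transformers`:

```python
from transformers import AutoTokenizer

# Checkpoint taken from the question above.
tokenizer = AutoTokenizer.from_pretrained("decapoda-research/llama-7b-hf")

# Reuse the end-of-sequence token as the padding token, as the error suggests...
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# ...or, alternatively, register a dedicated [PAD] token instead
# (the model's embedding matrix would then need to be resized accordingly):
# tokenizer.add_special_tokens({"pad_token": "[PAD]"})
```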

Thanks for sharing. The model that you pass to KeyBERT is only used to create the embeddings, not to perform the keyword extraction itself. It should be possible to integrate LLaMA within KeyBERT, but since its procedure is quite different from how KeyBERT works, many parameters would have no effect, such as use_mmr, use_maxsum, vectorizer, doc_embeddings, word_embeddings, etc.
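To make that distinction concrete, here is a minimal sketch of the standard KeyBERT flow with a sentence-transformers embedding model ("all-MiniLM-L6-v2" is the commonly used default). The model only supplies embeddings; options such as use_mmr operate on those embeddings afterwards:

```python
from keybert import KeyBERT

doc = "KeyBERT extracts keywords by comparing document and word embeddings."

# The model argument is only used to embed the document and candidate words.
kw_model = KeyBERT(model="all-MiniLM-L6-v2")

# Diversification (MMR) is applied on top of the resulting embeddings.
keywords = kw_model.extract_keywords(doc, use_mmr=True, diversity=0.5)
print(keywords)
```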