MaartenGr / KeyBERT

Minimal keyword extraction with BERT

Home Page: https://MaartenGr.github.io/KeyBERT/

Not able to use gensim

oloumisarah opened this issue

When following the documentation on gensim to use word2vec I get the following error:

from keybert.backend import GensimBackend
import gensim.downloader as api

ft = api.load('fasttext-wiki-news-subwords-300')
ft_embedder = GensimBackend(ft)


---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/tmp/ipykernel_226449/4228192336.py in <module>
----> 1 from keybert.backend import GensimBackend
      2 import gensim.downloader as api
      3 
      4 ft = api.load('fasttext-wiki-news-subwords-300')
      5 ft_embedder = GensimBackend(ft)

ImportError: cannot import name 'GensimBackend' from 'keybert.backend' (/home/sean/KeyBERT/keybert/backend/__init__.py)

Does GensimBackend need to be included in __init__.py?
Also, would I just pass ft_embedder as the model to KeyBERT?
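
An ImportError of the form "cannot import name X from Y" typically means that X is not an attribute of package Y once Y has been imported, for example because a class defined in a submodule is not re-exported from __init__.py. The following is a minimal sketch for checking what a package exposes, using only the standard library. The helper names exported_names and has_attribute are hypothetical, and json stands in for keybert.backend so the snippet runs without KeyBERT installed.

```python
import importlib


def exported_names(module_name):
    """Return the sorted public names a module or package exposes at top level."""
    module = importlib.import_module(module_name)
    # Prefer the explicit __all__ list; fall back to non-underscore attributes.
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in dir(module) if not n.startswith("_")]
    return sorted(names)


def has_attribute(module_name, name):
    """Return True if `name` is an attribute of the imported module."""
    module = importlib.import_module(module_name)
    return hasattr(module, name)


if __name__ == "__main__":
    # Stand-in example with a standard-library package.
    print(exported_names("json"))
    print(has_attribute("json", "dumps"))
    print(has_attribute("json", "GensimBackend"))
```

With KeyBERT installed, calling has_attribute("keybert.backend", "GensimBackend") would show whether the installed version re-exports the class; the result depends on that version.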

Is this the full error message? It feels like something is missing here.

Also, which version of KeyBERT are you using?

@MaartenGr Yep, this is all I see

I also just ran pip3 install keybert[gensim], restarted the kernel, and ran it again, but the error persists. Are you able to reproduce it?

I have v0.8.4 btw

This seemed to work, but I'm not sure if it's the right change:

diff --git a/keybert/backend/__init__.py b/keybert/backend/__init__.py
index a600155..de149d4 100644
--- a/keybert/backend/__init__.py
+++ b/keybert/backend/__init__.py
@@ -1,3 +1,4 @@
 from ._base import BaseEmbedder
+from ._gensim import GensimBackend

-__all__ = ["BaseEmbedder"]
+__all__ = ["BaseEmbedder","GensimBackend"]

Ah right, you do not need to import GensimBackend. If you follow along with the documentation you will notice that ft is directly passed to KeyBERT:

import gensim.downloader as api
from keybert import KeyBERT

ft = api.load('fasttext-wiki-news-subwords-300')
kw_model = KeyBERT(model=ft)
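
The snippet above stops at constructing the model. As a rough sketch of the remaining step, the function below loads the gensim vectors, passes them directly to KeyBERT as described, and calls extract_keywords. The function name extract_with_gensim and the sample sentence are assumptions made for illustration. Calling it requires keybert[gensim] to be installed, and the first call downloads the model.

```python
def extract_with_gensim(doc, model_name="fasttext-wiki-news-subwords-300", top_n=5):
    """Load gensim word vectors and pass them straight to KeyBERT.

    No explicit backend import is used; KeyBERT receives the gensim model as-is.
    """
    # Local imports: the optional dependencies are only needed when the
    # function is actually called, not when it is defined.
    import gensim.downloader as api
    from keybert import KeyBERT

    vectors = api.load(model_name)  # downloads the model on first use
    kw_model = KeyBERT(model=vectors)
    # Returns a list of (keyword, score) tuples.
    return kw_model.extract_keywords(doc, top_n=top_n)


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the thread.
    sample = "Keyword extraction finds the terms that best describe a document."
    print(extract_with_gensim(sample))
```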

Ah OK. Thanks :) I think I may have been confused because I was looking at the comments here:

from keybert.backend import GensimBackend