MaartenGr / BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

Home Page: https://maartengr.github.io/BERTopic/

Is the updated TensorFlow 2.16.1 version conflicting with BERTopic and bertopic.representation import OpenAI?

phoshell opened this issue · comments

Hi Maarten,

I've been exploring and enjoying BERTopic and all its superpowers for over a month. Thank you for creating something incredible. Strangely, though, for the last few days I haven't been able to run the model due to a very challenging and strange error. I noticed that TensorFlow was recently updated, and I'm thinking that might be related, but I'm a complete novice and am not sure what to look for. I hope this is actually a worthwhile flag...

Across several runs (with and without each of the pip install commands below), I keep getting stuck on the from bertopic import BERTopic step:

!pip install --upgrade bertopic
!pip install bertopic
!pip install sentence-transformers
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer

Giving this error:

AttributeError Traceback (most recent call last)
[<ipython-input-3-8b27323f5e75>](https://localhost:8080/#) in <cell line: 5>()
      3 get_ipython().system('pip install bertopic')
      4 get_ipython().system('pip install sentence-transformers')
----> 5 from bertopic import BERTopic
      6 from sentence_transformers import SentenceTransformer
      7 
20 frames
[/usr/local/lib/python3.10/dist-packages/tf_keras/src/saving/legacy/saved_model/load_context.py](https://localhost:8080/#) in <module>
     66 
     67 
---> 68 tf.__internal__.register_load_context_function(in_load_context)
     69 
AttributeError: module 'tensorflow._api.v2.compat.v2.__internal__' has no attribute 'register_load_context_function'

Strangely, a day before I started experiencing this error, I had a similar issue with OpenAI when running variations of this:

import io
import pandas as pd
from nltk.util import ngrams
import json

!pip install openai
import openai
from google.colab import userdata
openai.api_key = userdata.get('OpenAI') # add key from https://platform.openai.com/api-keys to Google Colab secrets

from bertopic.representation import MaximalMarginalRelevance, OpenAI

openai_generator = OpenAI(client)
mmr = MaximalMarginalRelevance(diversity=0.3)
representation_models = [mmr, openai_generator]

topic_model = BERTopic(representation_model=representation_models)

With the error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-33-3a0a6855a3bb> in <cell line: 6>()
      4 # Assuming OpenAI representation does not require a client object
      5 # and works directly with the `openai` module after setting the API key.
----> 6 openai_generator = OpenAI()  # If additional parameters are needed, they should be added here
      7 
      8 # Use the chained models

/usr/local/lib/python3.10/dist-packages/bertopic/_utils.py in __call__(self, *args, **kwargs)
     98 
     99     def __call__(self, *args, **kwargs):
--> 100         raise ModuleNotFoundError(self.msg)
    101 
    102 
ModuleNotFoundError: In order to use OpenAI you will need to install via; pip install openai

To debug I asked ChatGPT for help. For OpenAI, it told me: "The error message you're encountering indicates a problem with the way the OpenAI class is being imported or instantiated, not necessarily with the openai library installation itself. Given that you've confirmed the openai library is installed, the issue likely lies with how BERTopic is trying to interact with it.

From the BERTopic documentation and the code provided, it seems that the OpenAI class from BERTopic is expected to wrap around the OpenAI functionality. However, the error suggests that when you attempt to instantiate OpenAI(client), it's not recognizing it correctly, potentially due to a version mismatch or an update in the BERTopic library that isn't reflected in the documentation you're using."

Similarly, for the BERTopic errors, ChatGPT told me: "The error you're encountering is somewhat unusual and seems to be related to the internals of TensorFlow, particularly after the upgrade to version 2.16.1. The error indicates an issue within TensorFlow's saved model functionality, which might not be directly related to BERTopic but rather to how TensorFlow is being utilized by other libraries you're importing." It then recommended running:

import tensorflow as tf
print(tf.__version__)

I am running 2.16.1. I restarted the runtime and ran !pip install --upgrade sentence-transformers bertopic before running from bertopic import BERTopic again. No matter what I've tried, I suddenly can't get the model to run.
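For reference, the version check can be extended into a small stdlib helper that decides whether a downgrade is needed. This is only a sketch; treating 2.16.0 as the first affected release is an assumption based on this thread:

```python
# Minimal stdlib sketch: compare a reported TensorFlow version string
# against the release where the import error first appeared.
def parse_version(v):
    """Turn a version string like '2.16.1' into a comparable tuple (2, 16, 1)."""
    return tuple(int(part) for part in v.split("."))

def needs_downgrade(installed, first_bad="2.16.0"):
    """Return True if the installed version is at or past the first bad release."""
    return parse_version(installed) >= parse_version(first_bad)

print(needs_downgrade("2.16.1"))  # True -> pin an earlier tensorflow
print(needs_downgrade("2.15.0"))  # False
```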

Thank you again,
Charlotte

Let's take a step back first. In what kind of environment are you installing BERTopic? Since you mentioned using !pip install bertopic, are you using Google Colab? If so, I can run both BERTopic and OpenAI without any problems using the following installation:

!pip install bertopic openai

After installing, simply restart your environment and it should work.

If that does not work or you are not using Google Colab, start from a completely new environment and then install the required packages.
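One way to confirm that a new environment really is free of a stray TensorFlow install (a hedged stdlib sketch; BERTopic itself does not require TensorFlow):

```python
# Check whether a 'tensorflow' package is importable WITHOUT importing it,
# since importing a broken install is exactly what raises the AttributeError.
import importlib.util

def tensorflow_installed():
    """Return True if a tensorflow package is present in this environment."""
    return importlib.util.find_spec("tensorflow") is not None

print(tensorflow_installed())  # False in a clean environment
```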

Update

I just tried it with tensorflow 2.16.1, and it does indeed seem to be the source of the issue. UMAP uses tensorflow for a certain feature, but only when it is installed. Since tensorflow is not a requirement for BERTopic, working from a fresh environment (without it) should work; if tensorflow is installed, version 2.16.1 causes these import issues.

If you absolutely need tensorflow, then 2.15.0 should work.
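Based on the workaround above, a fail-fast check can be sketched like this. The "2.16" prefix test and the messages are assumptions drawn from this thread, not an official BERTopic API:

```python
import importlib.util

def check_tf_version(version=None):
    """Return a status string, or raise early if the version is known bad.

    With version=None, inspect the environment's TensorFlow, if any."""
    if version is None:
        if importlib.util.find_spec("tensorflow") is None:
            # TensorFlow is absent, which is fine: BERTopic does not require it.
            return "tensorflow absent: OK"
        import tensorflow as tf
        version = tf.__version__
    if version.startswith("2.16"):
        # Fail with a clear message instead of the opaque AttributeError
        # raised deep inside tf_keras at import time.
        raise RuntimeError(
            "TensorFlow 2.16.x breaks BERTopic's import chain; "
            "pin tensorflow==2.15.0 or uninstall it"
        )
    return f"tensorflow {version}: OK"

print(check_tf_version("2.15.0"))  # tensorflow 2.15.0: OK
```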