MIND-Lab / OCTIS

OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)

Error in get_vocabulary when optimizing

PhDPyBoss opened this issue

  • OCTIS version: newest (?), the one in the Google Colab notebook
  • Python version: the one used by Google Colab
  • Operating System: Windows 10

Description

Hi all,
I am trying to run the notebook on my own dataset of Twitter bios (a CSV file).
However, in the optimization step I get the error below, and I don't know how to fix it because I don't really understand what is going on.
Can you help me?
Thanks!

What I Did

I just followed the notebook, except for using my own dataset.

```
Current call:  0

---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

[<ipython-input-9-9e637ce072a4>](https://localhost:8080/#) in <module>
      1 optimizer=Optimizer()
----> 2 optimization_result = optimizer.optimize(
      3     model, documents, npmi, search_space, number_of_call=optimization_runs,
      4     model_runs=model_runs, save_models=True,
      5     extra_metrics=None, # to keep track of other metrics

3 frames

[/usr/local/lib/python3.8/dist-packages/octis/models/CTM.py](https://localhost:8080/#) in train_model(self, dataset, hyperparameters, top_words)
     98 
     99         self.set_params(hyperparameters)
--> 100         self.vocab = dataset.get_vocabulary()
    101         self.set_seed(seed=self.hyperparameters['seed'])
    102 

AttributeError: 'list' object has no attribute 'get_vocabulary'
```

Hi, when you load a custom dataset, you need to point OCTIS to a folder that contains a corpus.tsv file and a vocabulary.txt file. I suspect the vocabulary file wasn't loaded correctly (if it exists at all). Could you check that? See https://github.com/MIND-Lab/OCTIS#load-a-custom-dataset
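
For reference, here is a minimal sketch of the intended flow (the folder path, topic count, and search-space values are placeholders, not taken from the notebook). Note that the traceback shows a plain list (`documents`) being passed as the second argument of `optimize`, while that argument should be the `Dataset` object itself, which is what provides `get_vocabulary()`:

```python
# Minimal sketch; the path and hyperparameter values are placeholders.
from octis.dataset.dataset import Dataset
from octis.models.CTM import CTM
from octis.evaluation_metrics.coherence_metrics import Coherence
from octis.optimization.optimizer import Optimizer
from skopt.space.space import Real

# The dataset folder must contain corpus.tsv and vocabulary.txt
# (see the "Load a custom dataset" section of the README).
dataset = Dataset()
dataset.load_custom_dataset_from_folder("path/to/dataset_folder")

model = CTM(num_topics=10)
npmi = Coherence(texts=dataset.get_corpus(), measure='c_npmi')
search_space = {"dropout": Real(low=0.0, high=0.95)}  # example hyperparameter

optimizer = Optimizer()
optimization_result = optimizer.optimize(
    model, dataset,  # pass the Dataset object here, not a list of documents
    npmi, search_space,
    number_of_call=10, model_runs=3, save_models=True,
    extra_metrics=None)
```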

Thanks,

Silvia