load_custom_dataset_from_folder
srashtchi opened this issue · comments
Hi Silvia
I managed to get my code running fine, thanks for your response.
I have another question. I am trying to make the code smoother. Right now, in order to create a dataset object, I have to save my variable to a .tsv file first and then use the load_custom_dataset_from_folder
method to load the data from the .tsv file into an empty dataset object. Without this object, the get_corpus()
method obviously wouldn't do its magic. See the sample code below.
So basically the question is: is there a way to pass my variable directly to a dataset
object without saving and loading?
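from pathlib import Path
from octis.dataset.dataset import Dataset

# df is assumed to be an existing pandas DataFrame with a 'document' column
f = Path('/myFolderPath/corpus.tsv')
# Write one document per line, without index or header
df.to_csv(f, sep="\t", index=False, header=False, columns=['document'])
dataset = Dataset()
# Load corpus.tsv from the folder into the empty Dataset object
dataset.load_custom_dataset_from_folder('/myFolderPath/')
texts = dataset.get_corpus()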
Originally posted by @srashtchi in #68 (comment)
Is there any chance you could respond to this question?
Hello, sorry for the late reply.
If you need the dataset only to compute the coherence, you can define "texts" directly as a list of lists of strings, i.e.
texts = [['a', 'b', 'c'], ['a', 'd', 'e'], ...]
This does not require saving and loading the dataset.
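As a minimal sketch of that idea, assuming each document is a plain string and that splitting on whitespace is an acceptable tokenization for your data (both are assumptions, and documents_to_texts is a hypothetical helper name, not part of OCTIS), the list could be built in memory like this:

```python
from typing import List


def documents_to_texts(documents: List[str]) -> List[List[str]]:
    """Turn raw document strings into a list of token lists.

    Hypothetical helper: tokenization here is a plain whitespace split,
    which is an assumption and may not match your preprocessing.
    """
    # Skip empty or whitespace-only documents so no empty token list is produced
    return [doc.split() for doc in documents if doc.strip()]


# Example input: one string per document
documents = ["a b c", "a d e"]
texts = documents_to_texts(documents)
print(texts)
```

If the documents live in a pandas DataFrame column, as in the question, the input list could come from df['document'].tolist().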
Let me know if this helped :)
Silvia
Thanks for the quick reply. I will try this.