sdv-dev / Copulas

A library to model multivariate data using copulas.

Home Page: https://sdv.dev/Copulas/

Very long time for sampling from an already fitted model

samuelevirgili opened this issue

Hello everyone,
I have fitted a GaussianMultivariate copula to a dataset with 37500 rows and 4 columns (the activity of 4 neurons).
After fitting it, I saved the model. I then loaded it to generate 125000 samples. The code (reported below) raised no errors, but it has now been running for 22 hours.
I suspect I am misunderstanding something, because sampling from an already fitted model should not take this long.
Any idea?

This is the code I used:

from copulas.multivariate import GaussianMultivariate

copula = GaussianMultivariate()
copula.fit(my_database)  # my_database has shape (37500, 4)
copula.save(model_path)

This works fine. Then:

new_copula = GaussianMultivariate.load(model_path)
new_samples = new_copula.sample(125000)
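
To see how the runtime grows with the number of requested samples, the sampling call could be timed at a few increasing sizes before committing to the full run. The following is only a minimal sketch: `time_sampler` and `fake_sample` are hypothetical names introduced here and are not part of Copulas, and the stand-in sampler exists only so the snippet runs without a fitted model.

```python
import time


def time_sampler(sample_fn, sizes):
    """Call sample_fn(n) for each n in sizes and return {n: elapsed seconds}."""
    timings = {}
    for n in sizes:
        start = time.perf_counter()
        sample_fn(n)
        timings[n] = time.perf_counter() - start
    return timings


# Hypothetical stand-in so the sketch is self-contained;
# with a loaded model one would pass new_copula.sample instead.
def fake_sample(n):
    return [0.0] * n


timings = time_sampler(fake_sample, [100, 1_000, 10_000])
for n, seconds in timings.items():
    print(f"{n:>6} samples: {seconds:.4f} s")
```

With the real model, the call would be `time_sampler(new_copula.sample, [100, 1_000, 10_000])`, which gives a few points to extrapolate from.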

I should add that, before this, I was able to generate 1000 samples (took 9 minutes) and 10000 samples (took 31 minutes).
I am using Ubuntu 20.04 with Python 3.6.13 and copulas 0.4.0.
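
As a rough sanity check, the two timings reported above can be extrapolated to 125000 samples. The sketch below assumes either linear or power-law growth through those two points. Both growth models are assumptions made for illustration, not a description of how Copulas actually scales.

```python
import math

# Timings reported above: (number of samples, minutes).
measured = [(1_000, 9.0), (10_000, 31.0)]
target = 125_000

(n1, t1), (n2, t2) = measured

# Assumption 1: linear growth through the two measured points.
slope = (t2 - t1) / (n2 - n1)
linear_minutes = t2 + slope * (target - n2)

# Assumption 2: power-law growth, t = a * n**b, through the same points.
exponent = math.log(t2 / t1) / math.log(n2 / n1)
power_minutes = t2 * (target / n2) ** exponent

print(f"linear estimate:    {linear_minutes / 60:.1f} hours")
print(f"power-law estimate: {power_minutes / 60:.1f} hours")
```

Under either assumption the estimate comes out at a few hours, well short of the 22 hours observed so far.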

Let me know if you need any other information.
By the way, thanks for developing this package. The work you have done so far is amazing!