Baukebrenninkmeijer / On-the-Generation-and-Evaluation-of-Synthetic-Tabular-Data-using-GANs

Repository for the results of my master thesis, about the generation and evaluation of synthetic data using GANs

Tensorpack version

shreyanshs opened this issue · comments

Hi,

When using the latest version of Tensorpack, I get the following error:

from tensorpack import (
ImportError: cannot import name 'InputDesc' from 'tensorpack' (/Users/user/anaconda3/lib/python3.8/site-packages/tensorpack/__init__.py)

The Tensorpack GitHub page mentions that it is not stable, and that you must use the exact version of Tensorpack that a project was built with in order to run it.

So could you please specify the Tensorpack version you are using?

Thanks.

Hi @shreyanshs, this is not actively being maintained anymore.

If I recall correctly, I used the default TGAN versions, so their repo should be able to help you out: https://github.com/sdv-dev/TGAN. Specifically, their setup file specifies exact versions: https://github.com/sdv-dev/TGAN/blob/master/setup.py.

Let me know if it works with these versions.
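To compare a local environment against the versions pinned in that setup file, a small check along these lines may help. It is only a sketch: `installed_version` and `has_attribute` are hypothetical helper names, not part of Tensorpack or TGAN, and the only assumption about Tensorpack is that the `InputDesc` name from the traceback above is what goes missing in newer releases.

```python
import importlib
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_attribute(module_name, attr):
    """Return True if `module_name` can be imported and exposes `attr` at top level."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


# Report what is installed locally; compare this by hand with the pinned versions.
print("tensorpack version:", installed_version("tensorpack"))
print("tensorpack exposes InputDesc:", has_attribute("tensorpack", "InputDesc"))
```

If the second line prints `False` while Tensorpack is installed, the installed release is probably newer than the one the code was written against.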

In general, however, I would recommend not using this and instead looking at the Synthetic Data Vault or its GAN-based implementations like CopulaGAN and CTGAN. All of them can be found on the GitHub of the authors of TGAN: https://github.com/sdv-dev?type=source

Hi @Baukebrenninkmeijer, I was actually trying out your repo after seeing this issue: sdv-dev/CTGAN#8. For my dataset, CTGAN was not able to capture the numerical values well, although the categorical features were fine. So I thought of trying out your implementation of TGAN-WGAN-GP.

Would you advise against that?

@shreyanshs Aaah ok, yeah that might actually be interesting. Let me know if you get my code running. I might still be able to help if not.

On the other hand, TGAN is not that complicated model-wise, so you could also try changing the model in CTGAN to the TGAN model, because the TGAN code is honestly quite horrible :|.

Hi @Baukebrenninkmeijer, I was able to make it work, but it is really slow: on the CPU, each iteration takes about 50 mins on the census data. I was wondering whether that is expected, since there are only 505 categories in total across all discrete columns. Or am I doing something wrong?

You were using it on the GPU? And it was slow like the CPU?

Can you describe your data a bit more? I can't tell just by this description whether that is expected. But in general, if it's as slow as the CPU, that's not good. Are you sure it's using the GPU?
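One way to check whether TensorFlow can see a GPU at all is to list the local devices. The sketch below assumes a TensorFlow installation in which the `tensorflow.python.client.device_lib` module is available; `visible_gpus` is a hypothetical helper name. It returns `None` when TensorFlow itself cannot be imported.

```python
def visible_gpus():
    """Return the names of GPU devices TensorFlow can see, or None if TensorFlow is missing."""
    try:
        # Assumption: this internal module is present in the installed TensorFlow.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    devices = device_lib.list_local_devices()
    return [d.name for d in devices if d.device_type == "GPU"]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment.")
elif not gpus:
    print("TensorFlow sees no GPU; training would run on the CPU.")
else:
    print("TensorFlow sees these GPUs:", gpus)
```

An empty list here would suggest that training is falling back to the CPU, for example because a CPU-only TensorFlow build is installed.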

Hi,

Sorry for the late response. I was on a bit of a break from work. Yes, the numbers I gave earlier were for the CPU. That was very slow, but on the GPU it works fine: around 10 mins on the census data.

So, basically all works now. Thanks for your help!

@shreyanshs Great! Closing the issue then.
If you have any interesting results, especially compared with CTGAN, please let me know. I'm very curious to see them.