zalandoresearch / psgan

Periodic Spatial Generative Adversarial Networks

Adapting PSGAN to 1D - Questions

Hello,

I am adapting your model to 1D data, without any global dimension (i.e. the equivalent of a single 1D texture). I am confused about the periodic part of the noise tensor and the periodic layer:

Why initialize the last wave parameter (the only one when there are no global dimensions) to 0 and 1?

Why are even indices on the periodic dimension initialized one way (a matrix whose columns are all identical) and odd indices with the transposed matrix? (lines 59-65 in psgan.py)

What does the periodic_affine parameter control? (I see, in the case of no global dimensions, that there will be a mixture between even and odd parts of the periodic noise tensor controlled by additional wave parameters, but why?)

I believe in 1D this parameter would not make sense, am I correct?

Since a line of code is worth a thousand words, see below my modification of the relevant functions (note that I am working in python3) for the case where dg = 0 and for 1D data (with channels of course).

    def _setup_wave_params(self):
        """
        Set up the parameters of the periodic dimensions
        They will be adjusted during the end-to-end training
        """

        if self.nz_periodic:

            # Shape must be a tuple: (n,) rather than (n), which is just an int
            bias2 = sharedX( self.g_init.sample( (self.nz_periodic,) ))
            self.wave_params = [bias2]

            a = np.zeros(self.nz_periodic)
            self.wave_params[-1].set_value(np.float32(a))
        else:
            self.wave_params = []


    def sample_noise_tensor(self, batchsize, zx):
        """
        Calculates Z noise tensor given a config
        @param zx : spatial size
        """
        nz = self.config.nz
        nzl = self.config.nz_local
        nzp = self.config.nz_periodic

        Z = np.zeros((batchsize,nz,zx))

        # Uniform Local
        Z[:,:nzl] = rng.uniform(-1.,1., (batchsize, nzl, zx) )

        # Spatially Periodic
        if nzp > 0:

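            # One base frequency per periodic channel, in [pi/2, pi)
            band = np.pi * (np.arange(nzp) / nzp + 1 ) / 2
            # Fill the last nzp channels (not nzl) with frequency * position
            Z[:,-nzp:] = band[:,None] * np.arange(zx)[None,:]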

        return Z.astype(floatX)

And the calculation done by the periodic layer would be :

    def _wave_calculation(self,Z):

        nPeriodic = self.config.nz_periodic

        if nPeriodic == 0:
            return Z

        W = self.wave_params[0].dimshuffle('x', 0, 'x')

        band = Z[:, -nPeriodic:] * W

        band += 2 * np.pi * srng.uniform((Z.shape[0],nPeriodic)).dimshuffle(0,1,'x')

        return T.concatenate([Z[:,:-nPeriodic], T.sin(band)], axis=1)

So the periodic part of the noise tensor would span over nz_periodic channels instead of 2 * nz_periodic in the 2D case.
Let me know if you find anything wrong !
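
For a quick shape check without Theano, the two functions above can be mirrored in plain NumPy. This is only a sketch under assumptions: the standalone function names, the explicit `w` argument, and the sizes used at the bottom are made up for illustration and are not part of psgan.py.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_noise_tensor_1d(batchsize, zx, nz_local, nz_periodic):
    """Build a (batch, nz_local + nz_periodic, zx) noise tensor."""
    Z = np.zeros((batchsize, nz_local + nz_periodic, zx))
    # Local part: i.i.d. uniform noise at every position
    Z[:, :nz_local] = rng.uniform(-1.0, 1.0, (batchsize, nz_local, zx))
    if nz_periodic > 0:
        # Periodic part: base frequency * position, one frequency per channel
        band = np.pi * (np.arange(nz_periodic) / nz_periodic + 1) / 2
        Z[:, -nz_periodic:] = band[:, None] * np.arange(zx)[None, :]
    return Z

def wave_calculation_1d(Z, w, nz_periodic):
    """Apply sin(w * x + random phase) to the periodic channels only."""
    if nz_periodic == 0:
        return Z
    band = Z[:, -nz_periodic:] * w[None, :, None]
    # One random phase per sample and per periodic channel
    phase = 2 * np.pi * rng.uniform(size=(Z.shape[0], nz_periodic))
    band = band + phase[:, :, None]
    return np.concatenate([Z[:, :-nz_periodic], np.sin(band)], axis=1)

# Hypothetical sizes, chosen only for the check
Z = sample_noise_tensor_1d(batchsize=4, zx=16, nz_local=3, nz_periodic=2)
out = wave_calculation_1d(Z, w=np.ones(2), nz_periodic=2)
print(out.shape)  # (4, 5, 16)
```

One thing this makes visible: with `w` set to all zeros, as in the initialization above, each periodic channel reduces to `sin(phase)` and is constant along the spatial axis until training moves `w` away from zero.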

Thank you !

Hi

To briefly answer most of your questions:
PSGAN deals with 2D planar data, so planar coordinates have X and Y components.

When using affine waves, the waves can be tilted at any angle relative to the axes (e.g. 0.5X + 0.5Y for a 45 degree wave).
If not affine, then the waves are horizontal (1X + 0Y) and vertical (0X + 1Y), plus optionally an additional multiplicative parameter w to tune the frequencies for X and Y.
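
To make the difference concrete, here is a small NumPy sketch of the phase `kx * X + ky * Y` for the three cases mentioned. The function name `plane_wave` and the grid size are assumptions for illustration only; this is not how psgan.py builds its waves.

```python
import numpy as np

def plane_wave(height, width, kx, ky, phase=0.0):
    """Return sin(kx * X + ky * Y + phase) on an integer pixel grid."""
    # Y indexes rows (axis 0), X indexes columns (axis 1)
    Y, X = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    return np.sin(kx * X + ky * Y + phase)

wave_x = plane_wave(8, 8, kx=1.0, ky=0.0)    # 1X + 0Y: varies along X only
wave_y = plane_wave(8, 8, kx=0.0, ky=1.0)    # 0X + 1Y: varies along Y only
wave_45 = plane_wave(8, 8, kx=0.5, ky=0.5)   # 0.5X + 0.5Y: tilted by 45 degrees
```

In the non-affine case only the first two kinds of wave are available (each possibly rescaled by w), whereas affine mixing lets `kx` and `ky` both be nonzero for the same wave.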

In the 1D case, you do not have both X and Y, so the code will simplify a bit. Each wave will be only X*w, where w is the frequency parameter.

I hope this helps you and wish you good luck with your project.
By the way, I am curious what kind of application you have in mind for a 1D PSGAN. Will that be audio or time-series related?

Thank you for your clarifications !

I want to apply PSGAN to the generation of amorphous atomic structures. These structures exhibit properties similar to textures: local correlations that are randomly and smoothly continued through space.
These are really 3D data that can be represented with an (N,3) matrix. Setting one of these dimensions as the channel dimension does indeed lead to 1D data.

I am still unsure whether this is the best way to represent N points in 3 dimensions, as the spatial correlations are less evident than in the case of an image.
But representing N points spatially would require quantizing the 3D space, as is done in 2D for an image, which would increase computation (we would have to perform 3D convolutions in this case) and limit the achievable precision.
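
As a rough illustration of that trade-off, the sketch below bins random points into a voxel grid with `np.histogramdd` and prints how the number of cells and the worst-case position error change with resolution. The point cloud is random placeholder data in a unit box, not a real atomic structure, and `voxelize` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder data: 1000 random points in a unit box
points = rng.uniform(0.0, 1.0, size=(1000, 3))

def voxelize(points, bins):
    """Count points per cell of a bins x bins x bins grid over [0, 1]^3."""
    grid, edges = np.histogramdd(
        points, bins=(bins,) * 3, range=[(0.0, 1.0)] * 3
    )
    return grid, edges

for bins in (8, 16, 32):
    grid, _ = voxelize(points, bins)
    cell = 1.0 / bins
    # A point is at most half a cell diagonal away from its cell centre
    max_err = 0.5 * cell * np.sqrt(3)
    print(bins, grid.size, max_err)
```

Doubling the resolution halves the worst-case error but multiplies the number of cells by 8, which is the cost being weighed here against the (N,3) representation.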

@massimilianocomin Hi, I think it would be better to use 3D convolutions than 1D convolutions with 3 channels, since too much spatial information is lost in the 1D setting.
I think what you are describing is a "solid texture"; you can find some useful details by googling it.
I've done some research in texture synthesis, and I'm interested in these so-called "amorphous atomic structures". Why are they related to textures? Are you familiar with this topic? Maybe we can have a small discussion :)