musyoku / chainer-gqn

Neural scene representation and rendering (GQN)


Path

mjchen611 opened this issue · comments

Hi,

In the file generative_query_network/run/shepard_matzler/create_dataset.py

should I add the paths of 'three.cpython-36m-x86_64-linux-gnu.so' and 'imgplot.cpython-36m-x86_64-linux-gnu.so' to the following line?
sys.path.append(os.path.join("..", ".."))

Can you give me an example?

Thank you,
Mingjia

I always get the error
ModuleNotFoundError: No module named 'gqn'

or

imgplot.so not found.
Please build imgplot before running your code.
cd imgplot
make

three.so not found.
Please build three before running your code.
cd three
make

But I have already compiled these two files in the path
/generative-query-network-master/generative_query_network/gqn/

I don't think you need to add it.
I didn't add that path, and create_dataset.py runs fine for me.
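For reference, a minimal sketch of what that `sys.path.append` line does (my reading of the setup; the relative path is resolved against the current working directory, which is why the script has to be launched from inside run/shepard_matzler):

```python
import os
import sys

# create_dataset.py appends the repository root (two directories up) to
# sys.path so that `import gqn` finds generative_query_network/gqn/.
repo_root = os.path.join("..", "..")
sys.path.append(repo_root)

# Relative sys.path entries are resolved against the current working
# directory at import time, so this only works when the process is
# started from run/shepard_matzler.
print(os.path.abspath(repo_root))
```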

Can you give me an example of how to run this code on Ubuntu?
I am not familiar with Ubuntu.
Thank you,
Mingjia

cd generative_query_network/run/shepard_matzler
python3 create_dataset.py --image-size 64 -per-file 10000 -total 2000000

make sure you are in the shepard_matzler directory

I am sure I am in the shepard_matzler directory,

but I still get the following error:

imgplot.so not found.
Please build imgplot before running your code.
cd imgplot
make

three.so not found.
Please build three before running your code.
cd three
make

These two files, imgplot.so and three.so, should be in the generative_query_network/gqn/ directory, right?

yes

mj@mj:/media/mj/E390-1F3B/E/3_researchWork5/GQN/generative-query-network-master/generative_query_network/run/shepard_matzler$ python3.6 create_dataset.py

imgplot.so not found.
Please build imgplot before running your code.
cd imgplot
make

three.so not found.
Please build three before running your code.
cd three
make

Traceback (most recent call last):
File "create_dataset.py", line 106, in <module>
main()
File "create_dataset.py", line 15, in main
camera = gqn.three.PerspectiveCamera(
AttributeError: 'NoneType' object has no attribute 'PerspectiveCamera'

Please try deleting three.so and imgplot.so.
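The traceback is consistent with the stale .so files failing to import (for example, having been built against a different Python version or ABI) while the package's import guard binds the missing module to None instead of raising. A hypothetical sketch of that pattern (the module name `three_ext` is made up for illustration):

```python
# Hypothetical sketch: if the compiled extension fails to load, the name
# is bound to None, so the failure surfaces later as an AttributeError
# on first use rather than as an ImportError at startup.
try:
    import three_ext  # stands in for the compiled three.so
except ImportError:
    print("three.so not found.")
    three_ext = None

if three_ext is None:
    try:
        three_ext.PerspectiveCamera  # first use of the missing module
    except AttributeError as error:
        print(error)  # mirrors the AttributeError in the traceback above
```

Deleting and rebuilding the .so files replaces the broken binaries, which is why the suggestion above works.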

It is working now.
Thank you for your kind assistance!

What are the differences between train.py and train_mn.py?

train_mn.py supports ChainerMN, so you can use multiple GPUs.

Hi,

I am a little curious about the demo
https://gfycat.com/gifs/detail/WanAnimatedGull

How long does it take to train the model? And how many iterations?

Thank you,
Mingjia

My current training progress:
https://thumbs.gfycat.com/MiniatureShockedDouglasfirbarkbeetle-size_restricted.gif

Command:

mpirun -np 16 python3 train_mn.py --dataset-path /home/musyoku/dataset/ --snapshot-path results -b 64 -mu-i 0.0001 -mu-f 0.00001 -ps-i 2.0 -ps-f 0.7 -pn 1000000

Log:

---------------------------  --------
image_size                   (64, 64)
chrz_size                    (16, 16)
channels_r                   256
channels_chz                 64
inference_channels_map_x     32
inference_share_core         False
inference_share_posterior    False
generator_generation_steps   12
generator_channels_u         128
generator_share_core         False
generator_share_prior        False
pixel_sigma_i                2.0
pixel_sigma_f                0.7
pixel_n                      1000000
representation_architecture  tower
---------------------------  --------
--------------------  -------------------------------------------------------------------
mu_i                  0.0005
mu_f                  5e-05
n                     1600000.0
beta_1                0.9
beta_2                0.99
eps                   1e-08
optimizer             <chainer.optimizers.adam.Adam object at 0x7f2f9c0d6400>
multi_node_optimizer  <chainermn.optimizers._MultiNodeOptimizer object at 0x7f2f9c0d6470>
--------------------  -------------------------------------------------------------------
Iteration 1 - loss: nll: 19742.023 kld: 20.529 - lr: 4.9158e-04 - sigma_t: 1.961083 - step: 29952 - elapsed_time: 48.319 min
Iteration 2 - loss: nll: 19484.817 kld: 0.770 - lr: 4.8316e-04 - sigma_t: 1.922146 - step: 59904 - elapsed_time: 47.828 min
Iteration 3 - loss: nll: 19227.002 kld: 0.428 - lr: 4.7473e-04 - sigma_t: 1.883208 - step: 89856 - elapsed_time: 47.836 min
Iteration 4 - loss: nll: 18968.204 kld: 1.132 - lr: 4.6631e-04 - sigma_t: 1.844270 - step: 119808 - elapsed_time: 47.947 min
Iteration 5 - loss: nll: 18707.140 kld: 1.414 - lr: 4.5788e-04 - sigma_t: 1.805333 - step: 149760 - elapsed_time: 47.874 min
Iteration 6 - loss: nll: 18440.858 kld: 1.634 - lr: 4.4946e-04 - sigma_t: 1.766395 - step: 179712 - elapsed_time: 47.943 min
Iteration 7 - loss: nll: 18169.051 kld: 1.894 - lr: 4.4104e-04 - sigma_t: 1.727458 - step: 209664 - elapsed_time: 47.903 min
Iteration 8 - loss: nll: 17891.486 kld: 2.113 - lr: 4.3261e-04 - sigma_t: 1.688520 - step: 239616 - elapsed_time: 47.867 min
Iteration 9 - loss: nll: 17607.290 kld: 2.191 - lr: 4.2419e-04 - sigma_t: 1.649582 - step: 269568 - elapsed_time: 47.903 min
Iteration 10 - loss: nll: 17316.393 kld: 2.863 - lr: 4.1576e-04 - sigma_t: 1.610645 - step: 299520 - elapsed_time: 47.799 min
Iteration 11 - loss: nll: 17018.580 kld: 2.828 - lr: 4.0734e-04 - sigma_t: 1.571707 - step: 329472 - elapsed_time: 47.774 min
Iteration 12 - loss: nll: 16714.599 kld: 3.419 - lr: 3.9892e-04 - sigma_t: 1.532770 - step: 359424 - elapsed_time: 47.718 min
Iteration 13 - loss: nll: 16401.959 kld: 3.300 - lr: 3.9049e-04 - sigma_t: 1.493832 - step: 389376 - elapsed_time: 47.706 min
Iteration 14 - loss: nll: 16081.999 kld: 3.668 - lr: 3.8207e-04 - sigma_t: 1.454894 - step: 419328 - elapsed_time: 47.831 min
Iteration 15 - loss: nll: 15753.533 kld: 3.906 - lr: 3.7364e-04 - sigma_t: 1.415957 - step: 449280 - elapsed_time: 47.590 min
Iteration 16 - loss: nll: 15415.920 kld: 4.005 - lr: 3.6522e-04 - sigma_t: 1.377019 - step: 479232 - elapsed_time: 47.750 min
Iteration 17 - loss: nll: 15068.598 kld: 4.266 - lr: 3.5680e-04 - sigma_t: 1.338082 - step: 509184 - elapsed_time: 47.761 min
Iteration 18 - loss: nll: 14710.806 kld: 4.143 - lr: 3.4837e-04 - sigma_t: 1.299144 - step: 539136 - elapsed_time: 47.716 min
Iteration 19 - loss: nll: 14343.200 kld: 4.549 - lr: 3.3995e-04 - sigma_t: 1.260206 - step: 569088 - elapsed_time: 47.876 min
Iteration 20 - loss: nll: 13964.322 kld: 4.959 - lr: 3.3152e-04 - sigma_t: 1.221269 - step: 599040 - elapsed_time: 47.662 min

I am using 16 GPUs to run the training.
It would take about 2 weeks on a single GPU.

Thank you for your assistance.
Mingjia

I deleted the model above because I used the wrong hyperparameters for the Shepard-Matzler dataset.
I set the number of images per scene to 5, but in the paper it is actually 15.

I am running another training, and the current progress is:
https://gfycat.com/gifs/detail/ColossalWaryBarb
https://thumbs.gfycat.com/ColossalWaryBarb-size_restricted.gif

Trained model:
https://drive.google.com/open?id=1yosv_TWHq53vnzY_wzvZaz5wQooHX5rs

Log:

---------------------------  --------
image_size                   (64, 64)
chrz_size                    (16, 16)
channels_r                   256
channels_chz                 64
inference_channels_map_x     64
inference_share_core         False
inference_share_posterior    False
generator_generation_steps   8
generator_channels_u         64
generator_share_core         False
generator_share_prior        False
pixel_sigma_i                2.0
pixel_sigma_f                0.7
pixel_n                      200000
representation_architecture  tower
---------------------------  --------
--------------------  -------------------------------------------------------------------
mu_i                  0.0005
mu_f                  1e-05
n                     1600000.0
beta_1                0.9
beta_2                0.99
eps                   1e-08
optimizer             <chainer.optimizers.adam.Adam object at 0x7fcf297e5198>
multi_node_optimizer  <chainermn.optimizers._MultiNodeOptimizer object at 0x7fcf297e5208>
--------------------  -------------------------------------------------------------------
Iteration 1 - loss: nll: 19344.950 kld: 46.292 - lr: 4.9198e-04 - sigma_t: 1.829856 - step: 26208 - elapsed_time: 20.545 min
Iteration 2 - loss: nll: 18175.934 kld: 7.878 - lr: 4.8396e-04 - sigma_t: 1.659504 - step: 52416 - elapsed_time: 20.021 min
Iteration 3 - loss: nll: 16921.618 kld: 7.767 - lr: 4.7593e-04 - sigma_t: 1.489152 - step: 78624 - elapsed_time: 19.977 min
Iteration 4 - loss: nll: 15525.691 kld: 3.534 - lr: 4.6791e-04 - sigma_t: 1.318800 - step: 104832 - elapsed_time: 19.830 min
Iteration 5 - loss: nll: 13948.513 kld: 0.614 - lr: 4.5988e-04 - sigma_t: 1.148448 - step: 131040 - elapsed_time: 20.053 min
Iteration 6 - loss: nll: 12133.166 kld: 0.457 - lr: 4.5185e-04 - sigma_t: 0.978096 - step: 157248 - elapsed_time: 20.008 min
Iteration 7 - loss: nll: 9992.592 kld: 2.734 - lr: 4.4383e-04 - sigma_t: 0.807744 - step: 183456 - elapsed_time: 20.005 min
Iteration 8 - loss: nll: 7601.920 kld: 7.581 - lr: 4.3580e-04 - sigma_t: 0.700000 - step: 209664 - elapsed_time: 20.067 min
Iteration 9 - loss: nll: 7025.208 kld: 10.905 - lr: 4.2777e-04 - sigma_t: 0.700000 - step: 235872 - elapsed_time: 19.906 min
Iteration 10 - loss: nll: 7008.173 kld: 12.113 - lr: 4.1975e-04 - sigma_t: 0.700000 - step: 262080 - elapsed_time: 20.243 min
Iteration 11 - loss: nll: 6995.622 kld: 13.149 - lr: 4.1172e-04 - sigma_t: 0.700000 - step: 288288 - elapsed_time: 20.293 min
Iteration 12 - loss: nll: 6987.068 kld: 12.484 - lr: 4.0370e-04 - sigma_t: 0.700000 - step: 314496 - elapsed_time: 20.216 min
Iteration 13 - loss: nll: 6982.877 kld: 12.799 - lr: 3.9567e-04 - sigma_t: 0.700000 - step: 340704 - elapsed_time: 19.870 min
Iteration 14 - loss: nll: 6977.923 kld: 11.593 - lr: 3.8764e-04 - sigma_t: 0.700000 - step: 366912 - elapsed_time: 20.069 min
Iteration 15 - loss: nll: 6973.589 kld: 10.247 - lr: 3.7962e-04 - sigma_t: 0.700000 - step: 393120 - elapsed_time: 20.059 min
Iteration 16 - loss: nll: 6971.020 kld: 9.458 - lr: 3.7159e-04 - sigma_t: 0.700000 - step: 419328 - elapsed_time: 19.875 min
Iteration 17 - loss: nll: 6968.490 kld: 9.044 - lr: 3.6356e-04 - sigma_t: 0.700000 - step: 445536 - elapsed_time: 19.971 min
Iteration 18 - loss: nll: 6965.432 kld: 7.537 - lr: 3.5554e-04 - sigma_t: 0.700000 - step: 471744 - elapsed_time: 20.030 min
Iteration 19 - loss: nll: 6964.166 kld: 7.867 - lr: 3.4751e-04 - sigma_t: 0.700000 - step: 497952 - elapsed_time: 20.043 min
Iteration 20 - loss: nll: 6962.484 kld: 7.499 - lr: 3.3949e-04 - sigma_t: 0.700000 - step: 524160 - elapsed_time: 20.109 min
Iteration 21 - loss: nll: 6960.499 kld: 6.730 - lr: 3.3146e-04 - sigma_t: 0.700000 - step: 550368 - elapsed_time: 20.000 min
Iteration 22 - loss: nll: 6959.758 kld: 7.090 - lr: 3.2343e-04 - sigma_t: 0.700000 - step: 576576 - elapsed_time: 19.829 min
Iteration 23 - loss: nll: 6957.147 kld: 5.866 - lr: 3.1541e-04 - sigma_t: 0.700000 - step: 602784 - elapsed_time: 20.247 min
Iteration 24 - loss: nll: 6956.106 kld: 5.453 - lr: 3.0738e-04 - sigma_t: 0.700000 - step: 628992 - elapsed_time: 20.095 min
Iteration 25 - loss: nll: 6955.350 kld: 5.701 - lr: 2.9935e-04 - sigma_t: 0.700000 - step: 655200 - elapsed_time: 19.960 min
Iteration 26 - loss: nll: 6955.393 kld: 6.204 - lr: 2.9133e-04 - sigma_t: 0.700000 - step: 681408 - elapsed_time: 19.941 min
Iteration 27 - loss: nll: 6952.686 kld: 4.983 - lr: 2.8330e-04 - sigma_t: 0.700000 - step: 707616 - elapsed_time: 20.034 min
Iteration 28 - loss: nll: 6952.135 kld: 4.870 - lr: 2.7528e-04 - sigma_t: 0.700000 - step: 733824 - elapsed_time: 20.021 min
Iteration 29 - loss: nll: 6951.362 kld: 5.005 - lr: 2.6725e-04 - sigma_t: 0.700000 - step: 760032 - elapsed_time: 20.170 min
Iteration 30 - loss: nll: 6948.648 kld: 3.563 - lr: 2.5922e-04 - sigma_t: 0.700000 - step: 786240 - elapsed_time: 20.480 min
Iteration 31 - loss: nll: 6950.301 kld: 5.259 - lr: 2.5120e-04 - sigma_t: 0.700000 - step: 812448 - elapsed_time: 20.044 min
Iteration 32 - loss: nll: 6949.056 kld: 4.624 - lr: 2.4317e-04 - sigma_t: 0.700000 - step: 838656 - elapsed_time: 19.992 min
Iteration 33 - loss: nll: 6947.763 kld: 4.238 - lr: 2.3515e-04 - sigma_t: 0.700000 - step: 864864 - elapsed_time: 20.387 min
Iteration 34 - loss: nll: 6947.670 kld: 4.554 - lr: 2.2712e-04 - sigma_t: 0.700000 - step: 891072 - elapsed_time: 20.202 min
Iteration 35 - loss: nll: 6947.991 kld: 4.770 - lr: 2.1909e-04 - sigma_t: 0.700000 - step: 917280 - elapsed_time: 19.941 min
Iteration 36 - loss: nll: 6947.734 kld: 4.866 - lr: 2.1107e-04 - sigma_t: 0.700000 - step: 943488 - elapsed_time: 19.944 min
Iteration 37 - loss: nll: 6945.932 kld: 3.673 - lr: 2.0304e-04 - sigma_t: 0.700000 - step: 969696 - elapsed_time: 20.036 min
Iteration 38 - loss: nll: 6945.520 kld: 4.351 - lr: 1.9501e-04 - sigma_t: 0.700000 - step: 995904 - elapsed_time: 20.269 min
Iteration 39 - loss: nll: 6944.158 kld: 3.780 - lr: 1.8699e-04 - sigma_t: 0.700000 - step: 1022112 - elapsed_time: 20.302 min
Iteration 40 - loss: nll: 6943.310 kld: 3.397 - lr: 1.7896e-04 - sigma_t: 0.700000 - step: 1048320 - elapsed_time: 20.084 min
Iteration 41 - loss: nll: 6943.453 kld: 3.529 - lr: 1.7094e-04 - sigma_t: 0.700000 - step: 1074528 - elapsed_time: 20.163 min
Iteration 42 - loss: nll: 6944.061 kld: 4.264 - lr: 1.6291e-04 - sigma_t: 0.700000 - step: 1100736 - elapsed_time: 20.092 min
Iteration 43 - loss: nll: 6943.606 kld: 4.037 - lr: 1.5488e-04 - sigma_t: 0.700000 - step: 1126944 - elapsed_time: 20.176 min
Iteration 44 - loss: nll: 6942.461 kld: 3.558 - lr: 1.4686e-04 - sigma_t: 0.700000 - step: 1153152 - elapsed_time: 20.106 min
Iteration 45 - loss: nll: 6941.158 kld: 2.616 - lr: 1.3883e-04 - sigma_t: 0.700000 - step: 1179360 - elapsed_time: 20.422 min
Iteration 46 - loss: nll: 6941.918 kld: 3.560 - lr: 1.3080e-04 - sigma_t: 0.700000 - step: 1205568 - elapsed_time: 20.158 min
Iteration 47 - loss: nll: 6940.930 kld: 3.104 - lr: 1.2278e-04 - sigma_t: 0.700000 - step: 1231776 - elapsed_time: 20.173 min
Iteration 48 - loss: nll: 6940.197 kld: 2.989 - lr: 1.1475e-04 - sigma_t: 0.700000 - step: 1257984 - elapsed_time: 20.296 min
Iteration 49 - loss: nll: 6940.498 kld: 3.417 - lr: 1.0673e-04 - sigma_t: 0.700000 - step: 1284192 - elapsed_time: 20.160 min
Iteration 50 - loss: nll: 6941.139 kld: 3.853 - lr: 9.8700e-05 - sigma_t: 0.700000 - step: 1310400 - elapsed_time: 19.795 min
Iteration 51 - loss: nll: 6938.916 kld: 2.582 - lr: 9.0674e-05 - sigma_t: 0.700000 - step: 1336608 - elapsed_time: 20.320 min
Iteration 52 - loss: nll: 6940.421 kld: 3.877 - lr: 8.2647e-05 - sigma_t: 0.700000 - step: 1362816 - elapsed_time: 20.304 min
Iteration 53: Subset 225 / 698: Batch 33 / 39 - loss: nll: 6967.296 kld: 0.054 - lr: 7.9658e-05 - sigma_t: 0.7000000
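For reference, the sigma_t column above is consistent with a linear annealing of the pixel standard deviation from pixel_sigma_i to pixel_sigma_f over the first pixel_n updates, then held constant. A sketch (the linear form is my inference from the logged values, not code quoted from the repository; defaults match this run's settings):

```python
def annealed_sigma(step, sigma_i=2.0, sigma_f=0.7, n=200000):
    """Pixel-sigma schedule inferred from the sigma_t log column.

    Decays linearly from sigma_i to sigma_f over the first n parameter
    updates, then stays at sigma_f (defaults: pixel_sigma_i=2.0,
    pixel_sigma_f=0.7, pixel_n=200000 from the table above).
    """
    t = min(step, n) / n
    return sigma_i + (sigma_f - sigma_i) * t

# Iteration 1 above logs sigma_t = 1.829856 at step 26208:
print(round(annealed_sigma(26208), 3))  # agrees with the log to ~3 decimals
print(annealed_sigma(1_000_000))        # clamped at sigma_f
```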

Thank you

Hi @musyoku
sorry to bother you,
can you tell me how long it takes to train the model with 16 GPUs?
What type of GPU do you use?
With an SSD?

thank you for your kind answer,
mingjia

Hi.
I updated a whole bunch of code.
I found that the original loss function converges slowly, so I added a reconstruction loss and a new annealing strategy for the pixel variance.
Now you can train GQN on a single GPU.

However, even with the above updates the model does not converge.
It still cannot produce images with the same quality as DeepMind reported.

I am using a GTX 1080 Ti with an SSD.

Here is my current training progress:

(image: result_1)

The model was trained on a new dataset (containing 200,000 scenes), and the log is as follows:

device 2 / 4
device 0 / 4
device 3 / 4
device 1 / 4
-------------------------------  --------
image_size                       (64, 64)
chrz_size                        (16, 16)
h_channels                       64
z_channels                       3
inference_share_core             False
inference_share_posterior        False
inference_downsampler_channels   128
inference_lstm_peephole_enabled  False
generator_generation_steps       6
generator_u_channels             16
generator_share_core             False
generator_share_prior            False
generator_share_upsampler        False
generator_lstm_peephole_enabled  False
pixel_sigma_i                    2.0
pixel_sigma_f                    0.7
pixel_n                          250000
representation_architecture      tower
representation_channels          256
-------------------------------  --------
--------------------  -------------------------------------------------------------------
mu_i                  0.0005
mu_f                  0.0005
n                     1000000
beta_1                0.9
beta_2                0.99
eps                   1e-08
optimizer             <chainer.optimizers.adam.Adam object at 0x7f582717a5c0>
multi_node_optimizer  <chainermn.optimizers._MultiNodeOptimizer object at 0x7f582717a630>
--------------------  -------------------------------------------------------------------
---------------------  --------
sigma_start                 2
sigma_end                   0.7
pretrain_steps          50000
final_num_updates      250000
pixel_variance              1
kl_weight                   0
reconstruction_weight       1
---------------------  --------
Iteration 1 - loss: nll_per_pixel: 0.924396 mse: 0.014844 kld: 0.000000 - lr: 5.0000e-04 - pixel_variance: 1.000000 - step: 5500 - elapsed_time: 18.448 min
Iteration 2 - loss: nll_per_pixel: 0.920877 mse: 0.007811 kld: 0.000000 - lr: 5.0000e-04 - pixel_variance: 1.000000 - step: 11000 - elapsed_time: 18.124 min
Iteration 3 - loss: nll_per_pixel: 0.920372 mse: 0.006547 kld: 0.000000 - lr: 5.0000e-04 - pixel_variance: 1.000000 - step: 16500 - elapsed_time: 18.121 min
Iteration 4 - loss: nll_per_pixel: 0.920130 mse: 0.005929 kld: 0.000000 - lr: 5.0000e-04 - pixel_variance: 1.000000 - step: 22000 - elapsed_time: 18.223 min
Iteration 5 - loss: nll_per_pixel: 0.919975 mse: 0.005494 kld: 0.000000 - lr: 5.0000e-04 - pixel_variance: 1.000000 - step: 27500 - elapsed_time: 18.193 min
Iteration 6 - loss: nll_per_pixel: 0.919860 mse: 0.005150 kld: 0.000000 - lr: 5.0000e-04 - pixel_variance: 1.000000 - step: 33000 - elapsed_time: 18.343 min
Iteration 7 - loss: nll_per_pixel: 0.919782 mse: 0.005040 kld: 0.000000 - lr: 5.0000e-04 - pixel_variance: 1.000000 - step: 38500 - elapsed_time: 18.161 min
Iteration 8 - loss: nll_per_pixel: 0.919723 mse: 0.004824 kld: 0.000000 - lr: 5.0000e-04 - pixel_variance: 1.000000 - step: 44000 - elapsed_time: 18.282 min
Iteration 9 - loss: nll_per_pixel: 0.919685 mse: 0.004658 kld: 0.000000 - lr: 5.0000e-04 - pixel_variance: 1.000000 - step: 49500 - elapsed_time: 18.284 min
Iteration 10 - loss: nll_per_pixel: 1.420329 mse: 0.025532 kld: 1293.906355 - lr: 5.0000e-04 - pixel_variance: 1.714000 - step: 55000 - elapsed_time: 18.235 min
Iteration 11 - loss: nll_per_pixel: 1.453067 mse: 0.018676 kld: 316.632436 - lr: 5.0000e-04 - pixel_variance: 1.685400 - step: 60500 - elapsed_time: 18.314 min
Iteration 12 - loss: nll_per_pixel: 1.435901 mse: 0.016957 kld: 33.952145 - lr: 5.0000e-04 - pixel_variance: 1.656800 - step: 66000 - elapsed_time: 18.192 min
Iteration 13 - loss: nll_per_pixel: 1.418450 mse: 0.015359 kld: 0.385697 - lr: 5.0000e-04 - pixel_variance: 1.628200 - step: 71500 - elapsed_time: 18.282 min
Iteration 14 - loss: nll_per_pixel: 1.400922 mse: 0.015030 kld: 0.289059 - lr: 5.0000e-04 - pixel_variance: 1.599600 - step: 77000 - elapsed_time: 18.099 min
Iteration 15 - loss: nll_per_pixel: 1.383007 mse: 0.014333 kld: 0.217874 - lr: 5.0000e-04 - pixel_variance: 1.571000 - step: 82500 - elapsed_time: 18.190 min
Iteration 16 - loss: nll_per_pixel: 1.364752 mse: 0.013578 kld: 0.188471 - lr: 5.0000e-04 - pixel_variance: 1.542400 - step: 88000 - elapsed_time: 18.222 min
Iteration 17 - loss: nll_per_pixel: 1.346194 mse: 0.013016 kld: 0.136076 - lr: 5.0000e-04 - pixel_variance: 1.513800 - step: 93500 - elapsed_time: 18.450 min
Iteration 18 - loss: nll_per_pixel: 1.327359 mse: 0.012800 kld: 0.127023 - lr: 5.0000e-04 - pixel_variance: 1.485200 - step: 99000 - elapsed_time: 18.394 min
Iteration 19 - loss: nll_per_pixel: 1.308202 mse: 0.012749 kld: 0.125739 - lr: 5.0000e-04 - pixel_variance: 1.456600 - step: 104500 - elapsed_time: 18.254 min
Iteration 20 - loss: nll_per_pixel: 1.288512 mse: 0.012037 kld: 0.131593 - lr: 5.0000e-04 - pixel_variance: 1.428000 - step: 110000 - elapsed_time: 18.301 min
Iteration 21 - loss: nll_per_pixel: 1.268547 mse: 0.011820 kld: 0.117286 - lr: 5.0000e-04 - pixel_variance: 1.399400 - step: 115500 - elapsed_time: 18.752 min
Iteration 22 - loss: nll_per_pixel: 1.248141 mse: 0.011465 kld: 0.123291 - lr: 5.0000e-04 - pixel_variance: 1.370800 - step: 121000 - elapsed_time: 20.560 min
Iteration 23 - loss: nll_per_pixel: 1.227414 mse: 0.011510 kld: 0.133630 - lr: 5.0000e-04 - pixel_variance: 1.342200 - step: 126500 - elapsed_time: 20.440 min
Iteration 24 - loss: nll_per_pixel: 1.206132 mse: 0.011126 kld: 0.140435 - lr: 5.0000e-04 - pixel_variance: 1.313600 - step: 132000 - elapsed_time: 21.330 min
Iteration 25 - loss: nll_per_pixel: 1.184431 mse: 0.010896 kld: 0.144380 - lr: 5.0000e-04 - pixel_variance: 1.285000 - step: 137500 - elapsed_time: 20.552 min
Iteration 26 - loss: nll_per_pixel: 1.162383 mse: 0.011103 kld: 0.147033 - lr: 5.0000e-04 - pixel_variance: 1.256400 - step: 143000 - elapsed_time: 20.140 min
Iteration 27 - loss: nll_per_pixel: 1.139691 mse: 0.010837 kld: 0.146818 - lr: 5.0000e-04 - pixel_variance: 1.227800 - step: 148500 - elapsed_time: 21.143 min
Iteration 28 - loss: nll_per_pixel: 1.116534 mse: 0.010750 kld: 0.151079 - lr: 5.0000e-04 - pixel_variance: 1.199200 - step: 154000 - elapsed_time: 19.329 min
Iteration 29 - loss: nll_per_pixel: 1.092780 mse: 0.010514 kld: 0.162968 - lr: 5.0000e-04 - pixel_variance: 1.170600 - step: 159500 - elapsed_time: 20.594 min
Iteration 30 - loss: nll_per_pixel: 1.068442 mse: 0.010277 kld: 0.166883 - lr: 5.0000e-04 - pixel_variance: 1.142000 - step: 165000 - elapsed_time: 19.451 min
Iteration 31 - loss: nll_per_pixel: 1.043461 mse: 0.009939 kld: 0.169416 - lr: 5.0000e-04 - pixel_variance: 1.113400 - step: 170500 - elapsed_time: 20.604 min
Iteration 32 - loss: nll_per_pixel: 1.018130 mse: 0.010308 kld: 0.175877 - lr: 5.0000e-04 - pixel_variance: 1.084800 - step: 176000 - elapsed_time: 18.697 min
Iteration 33 - loss: nll_per_pixel: 0.991845 mse: 0.009971 kld: 0.180837 - lr: 5.0000e-04 - pixel_variance: 1.056200 - step: 181500 - elapsed_time: 18.620 min
Iteration 34 - loss: nll_per_pixel: 0.965044 mse: 0.010048 kld: 0.183863 - lr: 5.0000e-04 - pixel_variance: 1.027600 - step: 187000 - elapsed_time: 18.948 min
Iteration 35 - loss: nll_per_pixel: 0.937390 mse: 0.009877 kld: 0.214600 - lr: 5.0000e-04 - pixel_variance: 0.999000 - step: 192500 - elapsed_time: 19.027 min
Iteration 36 - loss: nll_per_pixel: 0.909048 mse: 0.009892 kld: 0.759714 - lr: 5.0000e-04 - pixel_variance: 0.970400 - step: 198000 - elapsed_time: 18.797 min
Iteration 37 - loss: nll_per_pixel: 0.879364 mse: 0.008953 kld: 4.162931 - lr: 5.0000e-04 - pixel_variance: 0.941800 - step: 203500 - elapsed_time: 19.031 min
Iteration 38 - loss: nll_per_pixel: 0.848626 mse: 0.007791 kld: 5.443319 - lr: 5.0000e-04 - pixel_variance: 0.913200 - step: 209000 - elapsed_time: 19.439 min
Iteration 39 - loss: nll_per_pixel: 0.817348 mse: 0.007390 kld: 6.950443 - lr: 5.0000e-04 - pixel_variance: 0.884600 - step: 214500 - elapsed_time: 19.375 min
Iteration 40 - loss: nll_per_pixel: 0.785102 mse: 0.007063 kld: 7.883418 - lr: 5.0000e-04 - pixel_variance: 0.856000 - step: 220000 - elapsed_time: 19.384 min
Iteration 41 - loss: nll_per_pixel: 0.751775 mse: 0.006734 kld: 9.029592 - lr: 5.0000e-04 - pixel_variance: 0.827400 - step: 225500 - elapsed_time: 19.603 min
Iteration 42 - loss: nll_per_pixel: 0.717302 mse: 0.006414 kld: 10.422601 - lr: 5.0000e-04 - pixel_variance: 0.798800 - step: 231000 - elapsed_time: 19.537 min
Iteration 43 - loss: nll_per_pixel: 0.681670 mse: 0.006192 kld: 12.298899 - lr: 5.0000e-04 - pixel_variance: 0.770200 - step: 236500 - elapsed_time: 19.648 min
Iteration 44 - loss: nll_per_pixel: 0.644603 mse: 0.005834 kld: 12.114840 - lr: 5.0000e-04 - pixel_variance: 0.741600 - step: 242000 - elapsed_time: 20.560 min
Iteration 45 - loss: nll_per_pixel: 0.606224 mse: 0.005607 kld: 14.715453 - lr: 5.0000e-04 - pixel_variance: 0.713000 - step: 247500 - elapsed_time: 20.078 min
Iteration 46 - loss: nll_per_pixel: 0.572365 mse: 0.005384 kld: 16.051887 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 253000 - elapsed_time: 19.844 min
Iteration 47 - loss: nll_per_pixel: 0.568072 mse: 0.005250 kld: 17.193420 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 258500 - elapsed_time: 19.966 min
Iteration 48 - loss: nll_per_pixel: 0.567856 mse: 0.005038 kld: 16.925087 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 264000 - elapsed_time: 20.403 min
Iteration 49 - loss: nll_per_pixel: 0.567759 mse: 0.004943 kld: 17.232433 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 269500 - elapsed_time: 19.956 min
Iteration 50 - loss: nll_per_pixel: 0.567621 mse: 0.004808 kld: 16.815233 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 275000 - elapsed_time: 20.335 min
Iteration 51 - loss: nll_per_pixel: 0.567501 mse: 0.004691 kld: 17.876824 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 280500 - elapsed_time: 19.774 min
Iteration 52 - loss: nll_per_pixel: 0.567381 mse: 0.004573 kld: 17.499159 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 286000 - elapsed_time: 20.266 min
Iteration 53 - loss: nll_per_pixel: 0.567304 mse: 0.004498 kld: 18.040683 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 291500 - elapsed_time: 19.816 min
Iteration 54 - loss: nll_per_pixel: 0.567217 mse: 0.004412 kld: 18.361821 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 297000 - elapsed_time: 19.565 min
Iteration 55 - loss: nll_per_pixel: 0.567107 mse: 0.004305 kld: 18.061031 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 302500 - elapsed_time: 20.035 min
Iteration 56 - loss: nll_per_pixel: 0.566987 mse: 0.004186 kld: 17.097323 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 308000 - elapsed_time: 20.096 min
Iteration 57 - loss: nll_per_pixel: 0.566910 mse: 0.004111 kld: 17.262740 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 313500 - elapsed_time: 20.244 min
Iteration 58 - loss: nll_per_pixel: 0.566868 mse: 0.004070 kld: 17.938203 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 319000 - elapsed_time: 19.994 min
Iteration 59 - loss: nll_per_pixel: 0.566811 mse: 0.004014 kld: 18.487184 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 324500 - elapsed_time: 20.062 min
Iteration 60 - loss: nll_per_pixel: 0.566742 mse: 0.003947 kld: 18.615904 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 330000 - elapsed_time: 19.971 min
Iteration 61 - loss: nll_per_pixel: 0.566631 mse: 0.003838 kld: 17.258627 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 335500 - elapsed_time: 20.848 min
Iteration 62 - loss: nll_per_pixel: 0.566607 mse: 0.003815 kld: 17.725936 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 341000 - elapsed_time: 22.554 min
Iteration 63 - loss: nll_per_pixel: 0.566572 mse: 0.003780 kld: 18.235667 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 346500 - elapsed_time: 22.640 min
Iteration 64 - loss: nll_per_pixel: 0.566508 mse: 0.003717 kld: 17.953023 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 352000 - elapsed_time: 21.592 min
Iteration 65 - loss: nll_per_pixel: 0.566465 mse: 0.003675 kld: 17.570079 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 357500 - elapsed_time: 19.826 min
Iteration 66 - loss: nll_per_pixel: 0.566424 mse: 0.003635 kld: 17.481222 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 363000 - elapsed_time: 19.876 min
Iteration 67 - loss: nll_per_pixel: 0.566376 mse: 0.003588 kld: 17.899838 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 368500 - elapsed_time: 19.825 min
Iteration 68 - loss: nll_per_pixel: 0.566384 mse: 0.003595 kld: 19.105071 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 374000 - elapsed_time: 19.627 min
Iteration 69 - loss: nll_per_pixel: 0.566255 mse: 0.003469 kld: 16.578841 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 379500 - elapsed_time: 19.845 min
Iteration 70 - loss: nll_per_pixel: 0.566228 mse: 0.003443 kld: 16.395228 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 385000 - elapsed_time: 19.843 min
Iteration 71 - loss: nll_per_pixel: 0.566216 mse: 0.003431 kld: 16.970923 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 390500 - elapsed_time: 19.513 min
Iteration 72 - loss: nll_per_pixel: 0.566179 mse: 0.003395 kld: 16.877083 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 396000 - elapsed_time: 18.717 min
Iteration 73 - loss: nll_per_pixel: 0.566187 mse: 0.003402 kld: 17.724905 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 401500 - elapsed_time: 19.795 min
Iteration 74 - loss: nll_per_pixel: 0.566149 mse: 0.003365 kld: 17.584423 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 407000 - elapsed_time: 20.194 min
Iteration 75 - loss: nll_per_pixel: 0.566094 mse: 0.003312 kld: 16.308557 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 412500 - elapsed_time: 20.264 min
Iteration 76 - loss: nll_per_pixel: 0.566109 mse: 0.003326 kld: 17.543204 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 418000 - elapsed_time: 20.652 min
Iteration 77 - loss: nll_per_pixel: 0.566086 mse: 0.003304 kld: 17.366459 - lr: 5.0000e-04 - pixel_variance: 0.700000 - step: 423500 - elapsed_time: 20.571 min

I am using 4 GPUs because I found that the loss converges more slowly when I use more than 4 GPUs.

@musyoku thank you for your kind answer!

Hi @musyoku

Is the loss function that you changed
loss = (loss_nll / scheduler.pixel_variance) + loss_kld?

Do you use the MSE loss to optimize the network parameters?

And was the MNIST Dice demo (https://gfycat.com/ShamelessWindyFruitfly) also produced by the original GQN?

Looking forward to your kind answer!
Thank you!

The MSE loss that I added converges faster in terms of reconstruction, but the final results were not as good,
so I reverted it.

I emailed the author to ask for details.
It would take 3 months to train a model at 64x64 resolution, so I was advised to run training at a lower resolution (32x32) to iterate faster.

Ok, thank you for your reply.

So is the loss function of the Shepard-Matzler demo (https://gfycat.com/gifs/detail/FickleCoolAnglerfish) the following: loss = (loss_nll / scheduler.pixel_variance) + loss_kld?

How long did it take to train the models used to render the final Shepard-Matzler and Rooms demos?
(https://gfycat.com/gifs/detail/FickleCoolAnglerfish, https://gfycat.com/gifs/detail/UnrealisticJoyousGlobefish)

So is the loss function of the Shepard-Matzler demo the following: loss = (loss_nll / scheduler.pixel_variance) + loss_kld?

Yes

How long did it take to train the models used to render the final Shepard-Matzler and Rooms demos?

4 days

Ok, thank you.

Hi @musyoku ,

have you experimented with the paper "Consistent Jumpy Predictions for Videos and Scenes"?

not yet

Ok, thank you for your reply.