Understanding the TensorFlow autoencoder
christinazavou opened this issue · comments
Hello again,
I'm trying to run the TensorFlow implementation of the autoencoder and I have a few questions (apologies, this got long, but I tried to be detailed).
In the repo you have a config file (ae_resnet.yaml) for running a resnet with ".points" tfrecords. I have run this, and decode_shape gives me pretty good results! However, this is not the case when I try to run an autoencoder with ocnn and ".octree" tfrecords.
Here are the steps I followed:
Step 1. I ran python data/completion.py --run generate_dataset, which downloaded the ".points" files of the completion dataset and generated:
shape.ply
shape.points
test.scans.ply
test.scans.points
completion_test_points.tfrecords
completion_test_scans_points.tfrecords
completion_train_points.tfrecords
completion_train_points.camera_path.dict
filelist_test.txt
filelist_test_scans.txt
filelist_train.txt
Step 2. Under each category folder in shape.points I generated a list.txt that includes the paths of its points files, and ran octree --filenames category/list.txt --output_path shape.octrees/category --depth 6 --split_label 1 --rot_num 6
Here, I used --depth 6 and --split_label 1 because I saw these parameters in ae_resnet.yaml.
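Step 2's per-category list.txt files can be generated with a small script. A minimal sketch (the function name is my own, and I assume each category folder sits directly under shape.points):

```python
import os

def write_category_filelists(points_root):
    """For each category folder under points_root, write a list.txt
    containing the absolute paths of its .points files (one per line),
    in the form the `octree` tool expects via --filenames."""
    list_files = []
    for category in sorted(os.listdir(points_root)):
        folder = os.path.join(points_root, category)
        if not os.path.isdir(folder):
            continue
        points = sorted(f for f in os.listdir(folder) if f.endswith('.points'))
        list_path = os.path.join(folder, 'list.txt')
        with open(list_path, 'w') as fid:
            fid.write('\n'.join(os.path.join(folder, f) for f in points))
        list_files.append(list_path)
    return list_files
```

Each returned list.txt path can then be passed to the octree tool as --filenames.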
Step 3. Similarly to filelist_test.txt and filelist_train.txt, I generated files that include the paths of the corresponding octrees (for each .points file path I included the 6 .octree file paths) and then made octree tfrecords. Specifically, these files:
completion_test_octrees.tfrecords
completion_test_scans_octrees.tfrecords
completion_train_octrees.tfrecords
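The path expansion in Step 3 can be sketched as follows, assuming the octree tool names its outputs `<name>_<depth>_<full_layer>_<rot:03d>.octree` (as in the `_6_2_000.octree` suffix); the function name is my own:

```python
def expand_to_octree_paths(points_list, depth=6, full_layer=2, rot_num=6):
    """Expand each <name>.points path into the rot_num octree paths that
    the `octree` tool generates, assuming the naming pattern
    <name>_<depth>_<full_layer>_<rot:03d>.octree (e.g. _6_2_000.octree)."""
    octree_paths = []
    for path in points_list:
        stem = path[:-len('.points')]
        for rot in range(rot_num):
            octree_paths.append('%s_%d_%d_%03d.octree' % (stem, depth, full_layer, rot))
    return octree_paths
```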
Step 4. I ran the autoencoder with ae_ocnn.yaml:
```yaml
SOLVER:
  gpu: 0,
  logdir: /output/ocnn_completion/ae/ocnn_b16
  run: train
  max_iter: 20000
  test_iter: 336
  test_every_iter: 400
  step_size: (80000,)
  ckpt_num: 20

DATA:
  train:
    dtype: octree
    depth: 6
    location: /completion_train_octrees.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True
  test:
    dtype: octree
    depth: 6
    location: /completion_test_octrees.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True

MODEL:
  name: ocnn
  channel: 3
  nout: 32
  depth: 6

LOSS:
  weight_decay: 0.0005
```
Results I got
What I noticed is that resnet with points takes more time for the same number of batch iterations (in one hour, resnet completes about 2,800 iterations while ocnn completes about 12,000), and the accuracy and loss curves of resnet are much smoother than those of ocnn, e.g.
resnet:
ocnn:
Also, the decoded octrees from resnet are much better than the ones from ocnn: e.g.
at 6K batch iterations of resnet:
at 20K batch iterations of ocnn:
Afterwards I realized I can run the ocnn model with the ".points" tfrecords by using ae_resnet.yaml and changing the model name to ocnn. This gave reasonable results, but I still don't understand why creating octrees and giving those as input didn't work.
Below I list my questions:
Q1.1 Is my configuration for the ocnn autoencoder wrong? Is it expected that ocnn runs much faster because it's more efficient than resnet? Do I have to train it for a much longer time in order to get as good results as with resnet?
Q1.2 Regarding the input signal: in ae_resnet.yaml the input channel is 4, and I realized after some time that this is due to node_dis: True. Can you explain what this is?
Q1.3 When I run the decode_shape mode and convert the input octrees into meshes with octree2mesh, the original shapes I get from ae_resnet.yaml and ae_ocnn.yaml are slightly different. Is this because I used 6 rotations in the octree generation, so the octree with suffix "_6_2_000.octree" is not entirely the same as the octree created from the point cloud?
Q2. When I tried to run the autoencoder with adaptive points or octrees, loss4 became extremely large, resulting in a total loss of nan, and then training got stuck. Do you have a hint what the problem is? (Note: for adaptive octrees I used Step 2 above with the extra argument --adaptive 4.)
Adaptive resnet:
test summaries:
config:
```yaml
SOLVER:
  gpu: 0,
  logdir: /media/christina/Data/ANFASS_data/O-CNN/output/ocnn_completion/ae/aresnet_b16
  run: train
  max_iter: 20000
  test_iter: 336
  test_every_iter: 400
  step_size: (80000,)
  ckpt_num: 20

DATA:
  train:
    dtype: points
    depth: 6
    location: /media/christina/Data/ANFASS_data/O-CNN/ocnn_completion/completion_train_points.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True
    adaptive: True
  test:
    dtype: points
    depth: 6
    location: /media/christina/Data/ANFASS_data/O-CNN/ocnn_completion/completion_test_points.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True
    adaptive: True

MODEL:
  name: resnet
  channel: 4
  nout: 32 # The channel of the hidden code, the code length is 4*4*4*32 (2048)
  depth: 6

LOSS:
  weight_decay: 0.0005
```
Adaptive ocnn:
test summaries:
config:
```yaml
SOLVER:
  gpu: 0,
  logdir: /media/christina/Data/ANFASS_data/O-CNN/output/ocnn_completion/ae/aocnn_b16
  run: train
  max_iter: 20000
  test_iter: 336
  test_every_iter: 400
  step_size: (80000,)
  ckpt_num: 20

DATA:
  train:
    dtype: octree
    depth: 6
    location: /media/christina/Data/ANFASS_data/O-CNN/ocnn_completion/completion_train_aoctrees.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True
  test:
    dtype: octree
    depth: 6
    location: /media/christina/Data/ANFASS_data/O-CNN/ocnn_completion/completion_test_aoctrees.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True

MODEL:
  name: ocnn
  channel: 3
  nout: 32 # The channel of the hidden code, the code length is 4*4*4*32 (2048)
  depth: 6

LOSS:
  weight_decay: 0.0005
```
Q3. What does the output of check_octree mean?
Example of an octree generated with octree ... --depth 6 --split_label 1 --rot_num 6:
```
===============
../ocnn_completion/.../02691156/1a04e3eab45ca15dd86060f189eb133_6_2_000.octree infomation:
This is a valid octree!
magic_str:_OCTREE_1.0_
batch_size: 1
depth: 6
full_layer: 2
adaptive_layer: 4
threshold_distance: 2
threshold_normal: 0.1
is_adaptive: 0
has_displace: 0
nnum: 1 8 64 224 1000 4336 17320 0 0 0 0 0 0 0 0 0
nnum_cum: 0 1 9 73 297 1297 5633 22953 22953 0 0 0 0 0 0 0
nnum_nempty: 1 8 28 125 542 2165 8177 0 0 0 0 0 0 0 0 0
total_nnum: 22953
total_nnum_capacity: 22953
channel: 1 1 0 3 0 1
locations: -1 -1 0 6 0 -1
bbox_max: 128.635 130.831 120.122
bbox_min: 17.2621 19.4578 8.74912
key2xyz: 0
sizeof_octree: 483992
===============
```
Q3.1. I guess adaptive_layer: 4 is dummy because of is_adaptive: 0?
Q3.2. What is the locations parameter showing?
Q3.3 What is the channel parameter showing? I understood it corresponds to split, label, feature, xyz, index, and one more property. What is this property, and what is their order?
Q1.1 Is my configuration for ocnn autoencoder wrong?
Your configuration is correct.
Q1.1 Is it expected that ocnn runs much faster because its more efficient than resnet?
Yes, the ocnn contains far fewer convolution layers than the resnet, so it runs much faster.
Q1.1 Do I have to train it for a much longer time in order to get as good results as with resnet?
The ocnn contains fewer convolution layers and fewer trainable parameters, so the fitting ability of the shallow ocnn is weaker. Its performance cannot match the resnet's even with a longer training time.
Q1.2 Regarding the input signal: in ae_resnet.yaml the input channel is 4, and I realized after some time that this is due to node_dis: True. Can you explain what this is?
In ae_resnet.yaml the parameter node_dis is True, so the octree nodes contain a 4-channel signal (normal: 3 channels + local displacement: 1 channel). This signal represents a local planar patch in each octree node: the normal gives the orientation of the patch, and the displacement gives the distance between the patch and the center of the corresponding octree node.
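As a toy illustration of this 4-channel signal (my own simplified computation, not the repo's exact averaging scheme): for the points that fall inside one octree node, take the averaged unit normal and the signed distance from the node center to the plane through the points' centroid.

```python
import numpy as np

def node_signal(points, normals, node_center):
    """Toy sketch of a 4-channel octree-node signal: averaged unit normal
    (3 channels) plus the signed distance from the node center to the
    plane through the points' centroid (1 channel)."""
    n = np.mean(normals, axis=0)
    n = n / np.linalg.norm(n)                          # averaged, re-normalized patch normal
    centroid = np.mean(points, axis=0)
    displacement = np.dot(n, centroid - node_center)   # signed point-to-plane distance
    return np.concatenate([n, [displacement]])         # shape (4,)
```

With node_dis: False, only the 3 normal channels would remain, matching channel: 3 in the ocnn config above.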
Q1.3 When I run the decode_shape mode and convert the input octrees into meshes with octree2mesh, the original shapes I get from ae_resnet.yaml and ae_ocnn.yaml are slightly different. Is this because I used 6 rotations in the octree generation, so the octree with suffix "_6_2_000.octree" is not entirely the same as the octree created from the point cloud?
When running the octree tool, you can try the following command: octree --filenames category/list.txt --output_path shape.octrees/category --depth 6 --split_label 1 --rot_num 6 --node_dis 1. By default, node_dis is off for the octree tool. With node_dis enabled, the generated octrees will be the same.
Q2. When I tried to run the autoencoder with adaptive points or octrees, loss4 became extremely large, resulting in a total loss of nan, and then training got stuck. Do you have a hint what the problem is?
The autoencoder for adaptive octree generation is implemented with Caffe; I have not tested adaptive octrees with TensorFlow. I read your configuration file and it looks OK, but there may be errors in the code.
Q3. What does the output of check_octree mean?
check_octree outputs some information for debugging purposes.
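For instance, the per-level counts in the check_octree dump above are self-consistent: nnum_cum holds the running prefix sums of nnum (first eight entries shown), and its last value equals total_nnum. A quick check:

```python
# Per-level node counts copied from the check_octree dump above
# (trailing zero entries for unused levels omitted).
nnum = [1, 8, 64, 224, 1000, 4336, 17320]

# Rebuild the prefix sums the tool reports as nnum_cum.
nnum_cum = [0]
for n in nnum:
    nnum_cum.append(nnum_cum[-1] + n)

print(nnum_cum)            # [0, 1, 9, 73, 297, 1297, 5633, 22953]
total_nnum = nnum_cum[-1]  # 22953, matching total_nnum in the dump
```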
Q3.1. I guess adaptive_layer: 4 is dummy because of is_adaptive: 0?
Yes, when is_adaptive is 0, adaptive_layer: 4 is a dummy value.
Q3.2. What is the locations parameter showing?
The locations field indicates at which octree level the corresponding octree property exists. If a location is 5, the property exists only in octree level 5; if it is -1, the property exists in every octree level.
Q3.3 What is the channel parameter showing? I understood it corresponds to split, label, feature, xyz, index, and one more property. What is this property, and what is their order?
The properties are defined in https://github.com/microsoft/O-CNN/blob/master/octree/octree/octree_info.h:
enum PropType { kKey = 1, kChild = 2, kNeigh = 4, kFeature = 8, kLabel = 16, kSplit = 32 };
The remaining property is the neighborhood information of octree nodes (kNeigh), which is used for neighborhood searching when doing convolution.
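If one assumes check_octree prints the channel and locations arrays in the enum's bit order (kKey, kChild, kNeigh, kFeature, kLabel, kSplit) — an assumption, but one that matches the dump above — the two arrays can be read side by side like this:

```python
# Property names in PropType bit order, paired with the channel and
# locations values from the check_octree dump above.
props = ['key', 'child', 'neigh', 'feature', 'label', 'split']
channel = [1, 1, 0, 3, 0, 1]
locations = [-1, -1, 0, 6, 0, -1]

for name, ch, loc in zip(props, channel, locations):
    where = 'every level' if loc == -1 else 'level %d' % loc
    print('%-8s channel=%d, stored at %s' % (name, ch, where))
```

Read this way, the 3-channel feature (the normal signal) lives only at depth 6, while the 1-channel split signal (from --split_label 1) is stored at every level.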
Thank you!