microsoft / O-CNN

O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis

understanding tensorflow autoencoder

christinazavou opened this issue · comments

Hello again,

I'm trying to run the TensorFlow implementation of the autoencoder and I have a few questions (excuse me, it got long, but I tried to be detailed).

In the repo you have a config file (ae_resnet.yaml) for running a resnet with ".points" tfrecords. I have run this, and using decode_shape gives me pretty good results! However, this is not the case when I try to run an autoencoder with ocnn and ".octree" tfrecords.

Here are the steps I followed:

Step 1. I ran python data/completion.py --run generate_dataset, which downloaded the ".points" files of the completion dataset and generated:

shape.ply
shape.points
test.scans.ply
test.scans.points
completion_test_points.tfrecords
completion_test_scans_points.tfrecords
completion_train_points.tfrecords
completion_train_points.camera_path.dict
filelist_test.txt
filelist_test_scans.txt
filelist_train.txt

Step 2. Under each category folder in shape.points I generated a list.txt that lists the paths of its points files, and ran octree --filenames category/list.txt --output_path shape.octrees/category --depth 6 --split_label 1 --rot_num 6

Here, I used --depth 6 and --split_label 1 because I saw these parameters in ae_resnet.yaml.
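For reference, the per-category loop of Step 2 can be scripted. Below is a sketch of what I did (the directory layout and the helper name make_category_lists are my own; it only writes the list files and returns the octree command lines rather than executing them):

```python
import os

def make_category_lists(points_root, octree_root,
                        depth=6, split_label=1, rot_num=6):
    """Write a list.txt per category folder under `points_root` and return
    the `octree` command line to run for each category (Step 2 above)."""
    commands = []
    for category in sorted(os.listdir(points_root)):
        category_dir = os.path.join(points_root, category)
        if not os.path.isdir(category_dir):
            continue
        # Collect the paths of all .points files of this category.
        points_files = sorted(
            os.path.join(category_dir, name)
            for name in os.listdir(category_dir) if name.endswith('.points'))
        list_file = os.path.join(category_dir, 'list.txt')
        with open(list_file, 'w') as fid:
            fid.write('\n'.join(points_files))
        commands.append(
            'octree --filenames {} --output_path {} --depth {} '
            '--split_label {} --rot_num {}'.format(
                list_file, os.path.join(octree_root, category),
                depth, split_label, rot_num))
    return commands
```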

Step 3. Similarly to filelist_test.txt and filelist_train.txt, I generated files listing the paths of the corresponding octrees (for each .points file path I included the 6 .octree file paths) and then made octree tfrecords. Specifically these files:

completion_test_octrees.tfrecords
completion_test_scans_octrees.tfrecords
completion_train_octrees.tfrecords
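For Step 3, the raw .octree files have to be serialized into TFRecords. This is a minimal sketch of the conversion, not the repo's own utility: the feature keys 'data_raw' and 'label' are assumptions and must match whatever keys the repo's input pipeline actually parses.

```python
import tensorflow as tf

def write_octree_tfrecords(file_label_pairs, output_file):
    # `file_label_pairs`: list of (path_to_octree_file, integer_label).
    # Each .octree file is stored verbatim as a bytes feature.
    # NOTE: feature keys are assumptions -- align them with the reader side.
    with tf.io.TFRecordWriter(output_file) as writer:
        for filename, label in file_label_pairs:
            with open(filename, 'rb') as fid:
                octree_bytes = fid.read()
            feature = {
                'data_raw': tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[octree_bytes])),
                'label': tf.train.Feature(
                    int64_list=tf.train.Int64List(value=[label])),
            }
            example = tf.train.Example(
                features=tf.train.Features(feature=feature))
            writer.write(example.SerializeToString())
```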

Step 4. I ran the autoencoder with ae_ocnn.yaml:

SOLVER:
  gpu: 0,
  logdir: /output/ocnn_completion/ae/ocnn_b16
  run: train
  max_iter: 20000
  test_iter: 336
  test_every_iter: 400
  step_size: (80000,)
  ckpt_num: 20

DATA:
  train:
    dtype: octree
    depth: 6
    location: /completion_train_octrees.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True

  test: 
    dtype: octree
    depth: 6
    location: /completion_test_octrees.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True

MODEL:
  name: ocnn
  channel: 3
  nout: 32
  depth: 6

LOSS:
  weight_decay: 0.0005

Results I got

What I noticed is that the resnet with points takes more time for the same number of batch iterations (in 1 hour, resnet does 2800 iterations while ocnn does 12000), and the accuracy and loss curves of resnet are much smoother than those of ocnn, e.g.
resnet:
[screenshot: resnet training curves]
ocnn:
[screenshot: ocnn training curves]

Also, the decoded octrees from resnet are much better than the ones from ocnn: e.g.

at 6K batch iterations of resnet:
[screenshot: decoded shape]
at 20K batch iterations of ocnn:
[screenshot: decoded shape]
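For scale, the throughput numbers above work out to roughly a 4x gap in iteration speed:

```python
# Back-of-the-envelope check of the reported throughputs (iterations/hour).
resnet_iters_per_hour = 2800
ocnn_iters_per_hour = 12000
speedup = ocnn_iters_per_hour / resnet_iters_per_hour
print('ocnn runs ~{:.1f}x more iterations per hour than resnet'.format(speedup))
```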

Afterwards I realized I can run the ocnn model with the ".points" tfrecords by using ae_resnet.yaml and changing the model name to ocnn. This gave reasonable results, but I still don't understand why creating octrees and feeding those as input didn't work.

Below I list my questions:

Q1.1 Is my configuration for the ocnn autoencoder wrong? Is it expected that ocnn runs much faster because it is more efficient than resnet? Do I have to train it for much longer to get results as good as with resnet?

Q1.2 Regarding the input signal: in ae_resnet.yaml the input channel is 4, and I realized after some time that this is due to node_dis: True. Can you explain what this is?

Q1.3 When I run the decode_shape mode and convert the input octrees into meshes with octree2mesh, the original shapes I get from ae_resnet.yaml and ae_ocnn.yaml are slightly different. Is this because I used 6 rotations in the octree generation, so the octree with suffix "_6_2_000.octree" is not exactly the same as the octree created from the point cloud?

original image in resnet:
[screenshot]

original image in ocnn:
[screenshot]

Q2. When I tried to run the autoencoder with adaptive points or octrees, loss4 was extremely big, resulting in a total loss of nan, and then training got stuck. Do you have a hint what the problem is? (Note: for adaptive octrees I used Step 2 above with the extra argument --adaptive 4.)

Adaptive resnet:

test summaries:

[screenshot]

config:

SOLVER:
  gpu: 0,
  logdir: /media/christina/Data/ANFASS_data/O-CNN/output/ocnn_completion/ae/aresnet_b16
  run: train
  max_iter: 20000
  test_iter: 336
  test_every_iter: 400
  step_size: (80000,)
  ckpt_num: 20

DATA:
  train:
    dtype: points
    depth: 6
    location: /media/christina/Data/ANFASS_data/O-CNN/ocnn_completion/completion_train_points.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True
    adaptive: True

  test: 
    dtype: points
    depth: 6
    location: /media/christina/Data/ANFASS_data/O-CNN/ocnn_completion/completion_test_points.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True
    adaptive: True

MODEL:
  name: resnet
  channel: 4
  nout: 32   # The channel of the hidden code, the code length is 4*4*4*32 (2048)
  depth: 6

LOSS:
  weight_decay: 0.0005
Adaptive ocnn:

test summaries:

[screenshot]

config:

SOLVER:
  gpu: 0,
  logdir: /media/christina/Data/ANFASS_data/O-CNN/output/ocnn_completion/ae/aocnn_b16
  run: train
  max_iter: 20000
  test_iter: 336
  test_every_iter: 400
  step_size: (80000,)
  ckpt_num: 20

DATA:
  train:
    dtype: octree
    depth: 6
    location: /media/christina/Data/ANFASS_data/O-CNN/ocnn_completion/completion_train_aoctrees.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True

  test: 
    dtype: octree
    depth: 6
    location: /media/christina/Data/ANFASS_data/O-CNN/ocnn_completion/completion_test_aoctrees.tfrecords
    batch_size: 16
    distort: False
    offset: 0.0
    node_dis: True
    split_label: True

MODEL:
  name: ocnn
  channel: 3
  nout: 32   # The channel of the hidden code, the code length is 4*4*4*32 (2048)
  depth: 6

LOSS:
  weight_decay: 0.0005

Q3. What does the output of check_octree mean?

Example of an octree generated with octree ... --depth 6 --split_label 1 --rot_num 6:

===============
../ocnn_completion/.../02691156/1a04e3eab45ca15dd86060f189eb133_6_2_000.octree infomation:
This is a valid octree!
magic_str:_OCTREE_1.0_
batch_size: 1
depth: 6
full_layer: 2
adaptive_layer: 4
threshold_distance: 2
threshold_normal: 0.1
is_adaptive: 0
has_displace: 0
nnum: 1 8 64 224 1000 4336 17320 0 0 0 0 0 0 0 0 0 
nnum_cum: 0 1 9 73 297 1297 5633 22953 22953 0 0 0 0 0 0 0 
nnum_nempty: 1 8 28 125 542 2165 8177 0 0 0 0 0 0 0 0 0 
total_nnum: 22953
total_nnum_capacity: 22953
channel: 1 1 0 3 0 1 
locations: -1 -1 0 6 0 -1 
bbox_max: 128.635 130.831 120.122 
bbox_min: 17.2621 19.4578 8.74912 
key2xyz: 0
sizeof_octree: 483992
===============

Q3.1. I guess adaptive_layer: 4 is a dummy value because of is_adaptive: 0?!

Q3.2. What is the locations parameter showing?

Q3.3 What is the channel parameter showing? I understood it corresponds to split, label, feature, xyz, index and one more property... which is this property and what is their order?

Q1.1 Is my configuration for the ocnn autoencoder wrong?

Your configuration is correct.

Q1.1 Is it expected that ocnn runs much faster because it is more efficient than resnet?

Yes, ocnn contains far fewer convolutional layers than resnet, so it runs much faster.

Q1.1 Do I have to train it for much longer to get results as good as with resnet?

The ocnn contains fewer convolutional layers and fewer trainable parameters, so the fitting ability of the shallower ocnn is weaker, and its performance cannot match resnet even with longer training.

Q1.2 Regarding the input signal: in ae_resnet.yaml the input channel is 4, and I realized after some time that this is due to node_dis: True. Can you explain what this is?

In ae_resnet.yaml the parameter node_dis is True, so the octree nodes carry a 4-channel signal (normal: 3 channels + local displacement: 1 channel) that represents a local planar patch in each octree node, i.e. the normal gives the orientation of the patch and the displacement gives the distance between the patch and the center of the corresponding octree node.
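To illustrate the geometry (this is only a sketch of the idea described above, not the repo's code; the exact scaling of the displacement channel is an assumption), the 4-channel signal can be decoded into a point on the local patch like this:

```python
import numpy as np

def patch_point(node_center, normal, displacement, node_size=1.0):
    """Recover a point on the local planar patch encoded by a 4-channel
    octree-node signal (normal + scalar displacement). Illustrative only;
    how `displacement` is scaled by the node size is an assumption."""
    normal = np.asarray(normal, dtype=np.float64)
    normal = normal / np.linalg.norm(normal)  # the stored normal is (near) unit
    # The patch passes through the node center shifted along the normal.
    return (np.asarray(node_center, dtype=np.float64)
            + displacement * node_size * normal)
```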

Q1.3 When I run the decode_shape mode and convert the input octrees into meshes with octree2mesh, the original shapes I get from ae_resnet.yaml and ae_ocnn.yaml are slightly different. Is this because I used 6 rotations in the octree generation, so the octree with suffix "_6_2_000.octree" is not exactly the same as the octree created from the point cloud?

When running the tool octree, you can try the following command: octree --filenames category/list.txt --output_path shape.octrees/category --depth 6 --split_label 1 --rot_num 6 --node_dis 1. By default, node_dis is off for the octree tool. Then the generated octrees will be the same.

Q2. When I tried to run the autoencoder with adaptive points or octrees, loss4 was extremely big, resulting in a total loss of nan, and then training got stuck. Do you have a hint what the problem is?

The autoencoder for adaptive octree generation is implemented with Caffe; I have not tested the adaptive octree with TensorFlow. I read your configuration file and it is OK, but there may be errors in the code.
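Until the bug is located, one generic way to find where the blow-up starts is TensorFlow's standard tf.debugging.check_numerics guard (shown here on a placeholder constant standing in for the real loss4 tensor):

```python
import tensorflow as tf

# Wrap the suspicious loss term so training fails fast at the first op
# producing inf/nan, instead of silently propagating nan into the total loss.
loss4 = tf.constant(1.0)  # placeholder standing in for the real loss4 tensor
loss4 = tf.debugging.check_numerics(loss4, 'loss4 became inf/nan')
```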

Q3. What does the output of check_octree mean?

The check_octree outputs some information for debugging purposes.

Q3.1. I guess adaptive_layer: 4 is a dummy value because of is_adaptive: 0?

Yes, when is_adaptive is 0, adaptive_layer: 4 is a dummy value.

Q3.2. What is the locations parameter showing?

The location indicates at which octree level the corresponding octree property is stored. If the location is 5, the property exists only at octree level 5; if it is -1, the property exists at every octree level.

Q3.3 What is the channel parameter showing? I understood it corresponds to split, label, feature, xyz, index and one more property... which is this property and what is their order?

The properties are defined in https://github.com/microsoft/O-CNN/blob/master/octree/octree/octree_info.h:

enum PropType {
  kKey = 1, kChild = 2, kNeigh = 4, kFeature = 8, kLabel = 16, kSplit = 32
};

There is another property, kNeigh, for the neighborhood search among octree nodes, which is used when performing convolution.
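Putting this together with the check_octree dump above: assuming the channel and locations fields are printed in the PropType enum order (key, child, neigh, feature, label, split), the six entries line up like this:

```python
# Pair the six per-property fields from the check_octree dump above.
# Order assumed to follow the PropType enum: key, child, neigh, feature, label, split.
props = ['key', 'child', 'neigh', 'feature', 'label', 'split']
channel = [1, 1, 0, 3, 0, 1]        # channels per property (0 = property absent)
locations = [-1, -1, 0, 6, 0, -1]   # -1 = stored at every level, d = only at level d
summary = {p: {'channel': c, 'location': l}
           for p, c, l in zip(props, channel, locations)}
print(summary['feature'])  # the 3-channel normals are stored at depth 6 only
```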

thank you!