microsoft / O-CNN

O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis

Not able to reproduce reported results for adaptive O-CNN classification and autoencoder

jeichelbaum opened this issue

EDIT: I managed to reproduce the classification accuracy and the autoencoder Chamfer distance as reported in the paper.

I am able to reliably reproduce the results for the full O-CNN classification experiment, but not for the adaptive O-CNN classification and autoencoder experiments. I gathered the results you reported and the results I was able to reproduce in the table below:

|  | depth 5 | depth 6 | depth 7 |
| --- | --- | --- | --- |
| Reported AO-CNN classification accuracy | 90.5% | 90.4% | 90.0% |
| Reproduced Caffe AO-CNN accuracy | 85.12% | 85.68% | 86.18% |
| Reproduced Tensorflow AO-CNN accuracy | 83.38% | 83.72% | 84.31% |
| Reported autoencoder avg. Chamfer dist. | - | - | 1.44 |
| Reproduced autoencoder avg. Chamfer dist. | - | - | 1.77 |

Classification experiment

I used the ModelNet40 points dataset you provide and followed the instructions in docs/classification exactly as posted. I tested this with both the Caffe and the Tensorflow implementation, but didn't get accuracies close to your results. The model I used is the same one you uploaded in caffe/experiments/aocnn_m40_5.prototxt and tensorflow/script/configs/cls_octree.yaml. I attached a visualization of the Caffe model for convenience:
[image: train — visualization of the Caffe AO-CNN model]

Going through the Tensorflow code raised some questions:

  1. Does the Tensorflow classifier implementation support adaptive octrees as input?
  2. Did you use different parameters to construct the octrees other than those mentioned in caffe/experiments/dataset.py?
  3. What signal size did you use in the adaptive O-CNN classifier experiment? The config files use signal_size=3 (just the normal), but from your paper I got the idea that you might have used signal_size=4 (normal + displacement; see the sketch after this list). I already tried the experiment with both signal sizes, but without success.
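
To make sure we mean the same thing by the two signal sizes, here is a minimal numpy sketch of how I interpret the per-octant input signal. This is my reading of the paper, not code from your repository, and the exact normalization is an assumption on my side:

```python
import numpy as np

def octant_signal(points, normals, center, scale, signal_size=3):
    """Sketch of a 3- or 4-channel signal for one non-empty octant.
    points/normals: the points falling into the octant and their normals;
    center/scale: the octant's center and edge length."""
    n = normals.mean(axis=0)
    n /= np.linalg.norm(n)                      # averaged unit normal (3 channels)
    if signal_size == 3:
        return n                                # normal only
    centroid = points.mean(axis=0)
    d = np.dot(centroid - center, n) / scale    # signed, normalized displacement
    return np.concatenate([n, [d]])             # normal + displacement (4 channels)
```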

Autoencoder experiment

Again, I used the ShapeNet point clouds you provided in combination with the provided caffe/experiments/ae_7_4.train.prototxt and executed each step as described in the docs section, but my results are still way off.
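
In case the gap comes from evaluation rather than training: this is essentially how I compute the average Chamfer distance between decoded and ground-truth point clouds (a standard two-sided nearest-neighbor formulation; I am not certain it matches your evaluation script exactly, e.g. regarding squaring or scale factors):

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(p, q):
    """Two-sided average nearest-neighbor distance between point sets
    p (N, 3) and q (M, 3). One common Chamfer variant; whether it matches
    the paper's exact definition is an assumption on my side."""
    d_pq, _ = cKDTree(q).query(p)   # for each point of p, nearest distance in q
    d_qp, _ = cKDTree(p).query(q)   # for each point of q, nearest distance in p
    return 0.5 * (d_pq.mean() + d_qp.mean())
```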

My big question is:

Do you have any pointers on how I can get closer to reproducing your experiments? Or would you be so kind as to share the trained models for both the Caffe adaptive O-CNN classifier and the autoencoder?

Thanks in advance

  1. For the adaptive O-CNN, the basic components are already implemented in our Tensorflow-based implementation. However, I have not tested the adaptive O-CNN on Tensorflow.

  2. I have updated dataset.py and fixed one parameter. Please pull the latest code and try again. With the depth-5 adaptive O-CNN, the classification accuracy is 89.4% before voting. After applying voting strategies such as the orientation pooling mentioned in our paper, the performance increases to about 90.5%. I just ran the experiment on my own PC today. The training log and the last caffemodel can be downloaded here. Other results will be released soon.

  3. In the classification experiments, we use only the first 3 channels. According to my own experiments, with 4-channel signals the testing accuracy may drop by about 0.2%, probably due to overfitting. In the autoencoder experiments, 4-channel signals are used.

I implemented voting to see whether I could reproduce the accuracy of the adaptive O-CNN classification experiment. I am seeing some improvement since your last commit, but I am still not able to fully reproduce your results. I am on the latest version of the O-CNN repository and did a clean installation of Ubuntu and Caffe.

|  | depth 5 | depth 6 | depth 7 |
| --- | --- | --- | --- |
| Reported AO-CNN classification accuracy with orientation pooling | 90.5% | 90.4% | 90.0% |
| Reproduced Caffe AO-CNN accuracy with average voting | 89.7% | 89.2% | 89.1% |
| Reproduced Caffe AO-CNN accuracy without voting | 89.1% | 88.9% | 88.6% |

I tested the model you uploaded in your previous comment, and it performs better than any of my trained models: it achieves 89.4% without voting and 90.4% with voting on my machine. Is this expected due to random initialization, or do you think I might still be doing something wrong?
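
For reference, this is how I evaluate the downloaded caffemodel; the prototxt and weight file names below are placeholders from my setup, only the standard `caffe test` CLI itself is fixed:

```python
import subprocess

# Evaluate downloaded weights with the standard `caffe test` tool.
# File names are placeholders, not fixed names from the repository.
subprocess.run([
    "caffe", "test",
    "--model=aocnn_m40_5.test.prototxt",   # test-phase network definition
    "--weights=aocnn_m40_5.caffemodel",    # the downloaded trained weights
    "--iterations=100",                    # enough batches to cover the test set
    "--gpu=0",
], check=True)
```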

  1. Please compare your training log with the one I uploaded in my last post. You can also run the training several times and compare the accuracies.

  2. With a simple average-voting strategy, the accuracy may increase by about 0.5%. With orientation pooling (for the detailed implementation, please refer to Section 4.4 of this paper), the accuracy may increase by about 1.0%. A sketch contrasting the two strategies follows below.
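
To make the distinction concrete, here is a minimal sketch of simple average voting, with orientation pooling noted for contrast. This is a summary of the idea, not code from the repository; `predict_logits` is a hypothetical stand-in for running the trained network on one orientation of a shape:

```python
import numpy as np

def average_voting(orientations, predict_logits):
    """Run the net once per rotated copy of a shape, average the
    per-class scores, and pick the best class."""
    logits = np.stack([predict_logits(o) for o in orientations])
    return int(np.argmax(logits.mean(axis=0)))

# Orientation pooling, by contrast, pools an intermediate feature over
# the orientations *inside* the network, so the pooling is part of
# training as well, not only a test-time post-process.
```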

  1. I ran the experiment multiple times, but the result doesn't change much. The behavior over time is consistent with your training log, but I always seem to fall 0.2-0.3% short in accuracy. In my opinion this is tolerable, and a lot better than the 85.1% accuracy of my initial training round.

  2. Thank you for clarifying and walking me through the process to reproduce your results! Orientation pooling explains the gap in accuracy and concludes the classification experiment for me.

  3. I finally managed to reproduce the AO-CNN autoencoder experiment; the resulting Chamfer distance is 1.45.

I have pushed the latest code to the master branch.
I automated the autoencoder experiment with Python. I ran the experiment last week, and the results are reproducible. The training log, model, and resulting Chamfer distance can be downloaded here.
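
The automation essentially chains the documented steps. A rough sketch of such a driver is below; all file names are placeholders, only the `caffe` CLI subcommands (train/test) are standard:

```python
import subprocess

# 1. train the autoencoder (solver file name is a placeholder)
subprocess.run(["caffe", "train",
                "--solver=ae_7_4.solver.prototxt", "--gpu=0"], check=True)

# 2. run the trained net over the test set to decode the shapes
subprocess.run(["caffe", "test",
                "--model=ae_7_4.test.prototxt",
                "--weights=ae_7_4.caffemodel",
                "--iterations=100", "--gpu=0"], check=True)

# 3. compute the average Chamfer distance over the decoded shapes,
#    e.g. with a chamfer_distance routine like the one sketched above.
```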