loicland / superpoint_graph

Large-scale Point Cloud Semantic Segmentation with Superpoint Graphs

Visualization Script

renwang435 opened this issue · comments

Hi Loic,

Thank you for this work! I had a quick question about the visualization script. I am trying to visualize your trained model on Semantic3D. The example you provide is:

python partition/visualize.py --dataset sema3d --ROOT_PATH $SEMA3D_DIR --res_file 'results/sema3d/trainval_best/prediction_testred' --file_path 'test_reduced/MarketplaceFeldkirch_Station4' --output_type ifprs

I can't quite tell from visualize.py what this script actually outputs, so:

  1. What exactly are the possible outputs of visualize.py?

  2. How can I take these outputs and get the segmented point clouds you demonstrate in your paper?

Hi,

from the README:

output_type defined as such:

  • 'i' = input rgb point cloud
  • 'g' = ground truth (if available), with the predefined class to color mapping
  • 'f' = geometric feature with color code: red = linearity, green = planarity, blue = verticality
  • 'p' = partition, with a random color for each superpoint
  • 'r' = result cloud, with the predefined class to color mapping
  • 'e' = error cloud, with green/red hue for correct/faulty prediction
  • 's' = superedge structure of the superpoint (toggle wireframe on meshlab to view it)

Add option --upsample 1 if you want the prediction file to be on the original, unpruned data (long).

Thank you for your prompt response Loic! I think the core of my question is that I'm unsure what the script actually outputs. Does it write the segmented point clouds in RGB to some txt format? I think I saw some .ply outputs as well; would I then have to post-process these PLY files somehow to get proper point cloud landscapes? Thank you very much again!

It outputs ASCII .ply files in the clouds folder of the directory where you stored your data. You can open them directly with MeshLab or CloudCompare.
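
If you would rather post-process them in Python than open them in MeshLab or CloudCompare, here is a minimal sketch for loading one of the generated ASCII .ply files, assuming the plyfile package is installed (the filename is only an example):

import numpy as np
from plyfile import PlyData

# read one of the clouds written by visualize.py (example path)
ply = PlyData.read('clouds/test_reduced/MarketplaceFeldkirch_Station4_pred.ply')
v = ply['vertex']
xyz = np.stack([v['x'], v['y'], v['z']], axis=1)           # point coordinates
rgb = np.stack([v['red'], v['green'], v['blue']], axis=1)  # colors encode class / partition / feature
print(xyz.shape, rgb.shape)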

Thank you Loic! Apologies, as I have another question: if I wanted to output a predictions.h5 file on the validation set (you used a custom 11/4 train/val split), how would I go about doing this? I want to be able to look at the ground truth point clouds as well as the error files, which can only be done on the 15 training point clouds, but your code doesn't appear to allow for this.

python partition/visualize.py --dataset sema3d --ROOT_PATH $SEMA3D_DIR --res_file results/sema3d/YOUR_FOLDER/training/predictions_test --file_path training/THE_FILE_YOU_WANT_TO_VISUALIZE --output_type gre

should work. If not, paste your error message here.

As always, thank you for your prompt response! There doesn't seem to be a training folder, which is why I'm not sure about that argument to --res_file. I only see predictions_testred.h5 and predictions_testfull.h5, which I believe are the predictions on the reduced and full test sets, not on the 15 training point clouds.

ok I understand now. Add this function in main.py and call it at the end of the main loop. This is untested code so you might have to play with it a bit to get it to work. Let me know if it worked or if you get stuck.

    def eval_on_train():
        """ Evaluate the model on the train set and save per-superpoint predictions """
        hf = h5py.File(os.path.join(args.odir, 'predictions_train.h5'), 'w')
        model.eval()
        loader = torch.utils.data.DataLoader(train_dataset, batch_size=1, collate_fn=spg.eccpc_collate, num_workers=args.nworkers)

        if logging.getLogger().getEffectiveLevel() > logging.DEBUG: loader = tqdm(loader, ncols=65)

        # iterate over the training clouds, one cloud per batch
        for bidx, (targets, GIs, clouds_data) in enumerate(loader):
            model.ecc.set_info(GIs, args.cuda)
            embeddings = ptnCloudEmbedder.run(model, *clouds_data)
            outputs = model.ecc(embeddings)
            o_cpu = outputs.data.cpu().numpy()

            # store the predicted class of each superpoint under the cloud's name
            fname = clouds_data[0][0][:clouds_data[0][0].rfind('.')]
            hf.create_dataset(name=fname, data=np.argmax(o_cpu, 1))
        hf.close()
        return
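
A minimal sketch of how the call could be placed (the placement is an assumption, adapt it to wherever eval_final() is invoked in your copy of main.py):

    # at the end of the main loop, after training / evaluation has finished
    if args.epochs == -1:      # i.e. running in pure evaluation mode with --resume
        eval_on_train()        # writes predictions_train.h5 into args.odir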

then:

python partition/visualize.py --dataset sema3d --ROOT_PATH $SEMA3D_DIR --res_file results/sema3d/YOUR_FOLDER/predictions_train --file_path training/THE_FILE_YOU_WANT_TO_VISUALIZE --output_type gre

should work

Hi Loic, thank you for that function! I was able to match it up with eval_final() and it appeared to run without problems. However, there's an error I'm seeing in visualize.py now. I get ValueError: It looks like the spg is not adapted to the result file. I've traced it to this particular section:

if (par_out or res_out) and (not os.path.isfile(spg_file)):    
    raise ValueError("%s does not exist and is needed to output the partition  or result ply" % spg_file) 
else:
    graph_spg, components, in_component = read_spg(spg_file)
if res_out or err_out:
    if not os.path.isfile(res_file):
        raise ValueError("%s does not exist and is needed to output the result ply" % res_file) 
    try:
        pred_red  = np.array(h5py.File(res_file, 'r').get(folder + file_name))        
        if (len(pred_red) != len(components)):
            raise ValueError("It looks like the spg is not adapted to the result file") 
        pred_full = reduced_labels2full(pred_red, components, len(xyz))
    except OSError:
        raise ValueError("%s does not exist in %s" % (folder + file_name, res_file))
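
For context, reduced_labels2full expands the one-label-per-superpoint vector pred_red into one label per point; a rough sketch of what it is expected to do, assuming components[i] holds the indices of the points belonging to superpoint i:

def reduced_labels2full_sketch(pred_red, components, n_points):
    # give every point the label predicted for its superpoint
    full = np.zeros((n_points,), dtype='uint8')
    for i_com, com in enumerate(components):
        full[com] = pred_red[i_com]
    return full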

I think your models trained fine, so the superpoint graph components should be the correct length, but I'm not sure your code was meant to read predictions_train.h5 directly through h5py like that (maybe?). It would be great to get your help debugging this. The command I run is:

python partition/visualize.py --dataset sema3d --ROOT_PATH $SEMANTIC_DIR --res_file 'results/sema3d/trainval_best/predictions_train' --file_path 'train/bildstein_station3' --output_type gre

This error message means that the size of the prediction is different from the size of the partition.

  • can you rebuild the parsed files (with sema3d_dataset.py) and rerun the inference + visualization script? (no need to retrain)
  • if the error is still here, could you print pred_red.shape and len(components) just before the error (e.g. as sketched below)?
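
For example (illustrative only), the debug print could go directly above the length check in the partition/visualize.py excerpt quoted earlier:

# temporary debug output, placed just before the ValueError in visualize.py
print('pred_red:', pred_red.shape, '- components:', len(components))
if (len(pred_red) != len(components)):
    raise ValueError("It looks like the spg is not adapted to the result file")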

Hi Loic,

Thank you for your continued help. I've rebuilt the parsed files and re-run the inference. Upon running the visualization command above (python partition/visualize.py --dataset sema3d --ROOT_PATH $SEMANTIC_DIR --res_file 'results/sema3d/trainval_best/predictions_train' --file_path 'train/bildstein_station3' --output_type gre), I still get the same error. It looks like len(pred_red) = 649 and len(components) = 2031. Would you be able to confirm that these are the numbers you see on your end, and that I'm not doing something incorrect in between?

Hi,

I finally had access to my station to try things out. I have a quick and dirty fix, which you would need to wrap in an extra option if this is something you want to do regularly, but it works on my machine and doesn't require extra code.

You can just change line 277 of main.py in eval_final from:
test_dataset_ss = create_dataset(args, ss)[1]
to
test_dataset_ss = create_dataset(args, ss)[2]

(Explanation: the first element returned is the train set, then the test set, then the validation set.) This will write the results for the validation set into predictions_testfull.
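
In other words (the index ordering as described above; the variable names are only illustrative):

datasets = create_dataset(args, ss)
train_dataset = datasets[0]   # training clouds
test_dataset  = datasets[1]   # test clouds, the default used by eval_final
valid_dataset = datasets[2]   # validation ("eval") clouds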

Then I was able to produce the prediction and error file with the following command:

python partition/visualize.py --dataset sema3d --ROOT_PATH  $SEMANTIC_DIR --res_file 'results/sema3d/trainval_best/predictions_testfull' --file_path 'train/bildstein_station3' --output_type gre

Let me know if this works for you as well. If you still get ValueError: It looks like the spg is not adapted to the result file, you should delete your parsed files and build them again. The number of superpoints must be consistent with the number in the superpoint graph file; you can check directly by opening the .h5 files, e.g. as in the sketch below.
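
A rough, untested sketch of such a check (paths are only examples), reusing the repo's read_spg() helper from the partition folder:

import h5py
import numpy as np
from provider import read_spg  # run from inside partition/, or adjust the import path

spg_file = 'superpoint_graphs/train/bildstein_station3.h5'          # example path
res_file = 'results/sema3d/trainval_best/predictions_testfull.h5'   # example path

graph_spg, components, in_component = read_spg(spg_file)
pred_red = np.array(h5py.File(res_file, 'r').get('train/bildstein_station3'))
print(len(pred_red), 'predictions vs', len(components), 'superpoints')  # these two must match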

Hi Loic,

I've proceeded as you've described, and there were some bugs along the way.

Firstly, the file sema3d_dataset.py on the ssp+spg branch references pathD = '{}/features_supervision/{}/'.format(SEMA3D_PATH, n) but I don't believe Semantic3D has been prepared for supervised partitions yet, so I changed this to the release branch version: pathD = '{}/features/{}/'.format(SEMA3D_PATH, n). Then, I was able to rebuild the parsed files using python learning/sema3d_dataset.py.

Secondly, I trained a network from scratch using CUDA_VISIBLE_DEVICES=0 python learning/main.py --dataset sema3d --SEMA3D_PATH ../sem8_data_dir --db_test_name testred --db_train_name trainval --epochs 500 --lr_steps '[350, 400, 450]' --test_nth_epoch 100 --model_config 'gru_10,f_8' --ptn_nfeat_stn 11 --nworkers 0 --pc_attrib xyzrgbelpsv --odir "results/sema3d/trainval_best". Then, I tested the network by setting --epochs -1 and --resume RESUME, as before. Here is where I noticed the following line being printed after the model is loaded and the multisampling test procedure begins:

Train dataset: 15 elements - Test dataset: 4 elements - Validation dataset: 0 elements

For some reason, it appears that the validation dataset is empty, so modifying line 277 within learning/main.py would just reference an empty dataset. I tried changing the line from test_dataset_ss = create_dataset(args, ss)[1] to test_dataset_ss = create_dataset(args, ss)[0], since the validation point clouds are included within the 15 training point clouds, but then I ran into this bug:

Traceback (most recent call last):
  File "learning/main.py", line 459, in <module>
    main()
  File "learning/main.py", line 381, in main
    acc_test, oacc_test, avg_iou_test, per_class_iou_test, predictions_test, avg_acc_test, confusion_matrix = eval_final()
  File "learning/main.py", line 296, in eval_final
    o_cpu = np.mean(np.stack(o_cpu,0),0)
  File "/home/r/renhaow/.local/lib/python3.6/site-packages/numpy/core/shape_base.py", line 416, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape

Do you have any insight into what might be going wrong here?

Hi,

i) thanks for pointing out the bug with supervized partition, I have corrected it.

ii) use --use_val_set 1 and test_dataset_ss = create_dataset(args, ss)[2]

iii) if the bug persists (which it shouldn't, as I have tested this on several stations), could you print the shapes of o_cpu? Something like:

print([x.shape for x in o_cpu])

That worked perfectly! Thank you Loic for all your assistance! Closing this issue now.