Question when aggregating predictions
mpelchat04 opened this issue · comments
Hi,
I'm trying to use SPG on airborne lidar data, and adapting the code was much easier than I expected, given the complexity of the work. Great work; your code is very complete and clean.
Anyway, here's my question:
In learning/main.py, I'm wondering why the prediction aggregation is done on index 0 of the resulting array. Here's the bit of code:
# aggregate predictions (mean)
for fname, lst in collected.items():
    o_cpu, t_cpu, tvec_cpu = list(zip(*lst))
    if args.test_multisamp_n > 1:
        o_cpu = np.mean(np.stack(o_cpu,0),0)
    else:
        o_cpu = o_cpu[0]
    t_cpu, tvec_cpu = t_cpu[0], tvec_cpu[0]
    predictions[fname] = np.argmax(o_cpu,1)
    o_cpu, t_cpu, tvec_cpu = filter_valid(o_cpu, t_cpu, tvec_cpu)
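To make the aggregation concrete, here is a minimal NumPy sketch of the multisample branch above, with made-up sizes (each run produces one logits array of shape n_superpoints × n_classes):

```python
import numpy as np

rng = np.random.default_rng(0)
n_superpoints, n_classes, n_runs = 5, 3, 4  # made-up sizes for illustration

# one logits array of shape (n_superpoints, n_classes) per inference run
runs = [rng.normal(size=(n_superpoints, n_classes)) for _ in range(n_runs)]

# stack along a new axis 0 and average over the runs, as in the snippet above
o_cpu = np.mean(np.stack(runs, 0), 0)

# one predicted class per superpoint
predictions = np.argmax(o_cpu, 1)

print(o_cpu.shape)        # (5, 3): one averaged logit row per superpoint
print(predictions.shape)  # (5,): one class label per superpoint
```

Note that `np.stack` requires every run's array to have the same shape, which is exactly the concern raised in this question.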
This code is fine if all the o_cpu arrays have the same size, but that's not always the case, depending on the input sampling... right? Or is my understanding of the test_multisamp_n parameter just off?
Thanks,
Math
Hi,
Glad the code was easy for you to adapt!
The role of test_multisamp_n is to run inference several times and average the logits of each superpoint. The reasoning is:
- inference is fast so we might as well run it a few times
- inference is stochastic (random sampling of each superpoint), and averaging should avoid errors due to unlucky samplings.
In practice it doesn't make a significant difference; you could remove it entirely (set it to 1) and see almost no drop in accuracy.
Now, the size of o_cpu will always be the number of superpoints in the SPG, so it will be the same at each run. While it's correct that for training we use a subgraph of the full SPG, at inference time we load the entire SPG.
Hope this clears things up; otherwise, don't hesitate to ask for further clarification.
Thanks for the quick response!
That really clears things up.
I see why the different o_cpu sizes I get at inference aren't supposed to happen, since the entire SPG is loaded...
Update: it turns out the train flag was set to True for both the validation and test datasets. That's why I got sampled SPGs at test time.
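For anyone hitting the same symptom, here is a toy sketch of the pattern at play. The class and its arguments are hypothetical (not the repo's actual API); it only illustrates how a train flag can make a dataset return a random subgraph during training and the full SPG at inference:

```python
import random

class SpgDataset:
    """Toy stand-in (hypothetical, not the repo's real class) for a dataset
    that subsamples the superpoint graph during training only."""
    def __init__(self, num_superpoints, train):
        self.num_superpoints = num_superpoints
        self.train = train

    def load_graph(self):
        if self.train:
            # training mode: random subgraph, so its size varies between runs
            k = random.randint(1, self.num_superpoints)
            return list(range(k))
        # inference mode: always the full SPG, so o_cpu has a fixed size
        return list(range(self.num_superpoints))

# the fix described above: build validation/test datasets with train=False
test_set = SpgDataset(num_superpoints=100, train=False)
print(len(test_set.load_graph()))  # 100, on every call
```

With train=True left on by mistake, load_graph() returns differently sized subgraphs across runs, which is exactly what breaks the np.stack in the aggregation code.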
Cheers,