wmayner / pyemd

Fast EMD for Python: a wrapper for Pele and Werman's C++ implementation of the Earth Mover's Distance metric

Thresholded Ground Distance

ksanjeevan opened this issue

Hi, the ICCV paper uses a thresholded ground distance d_t(a,b) = min(t, d(a,b)) to improve the time complexity. Does this need to be reflected in the distance_matrix before passing it as an argument? If so, what value should we set for the threshold?

Yes, if you set a threshold it must be reflected in the distance matrix—there's nothing in the code about thresholds. It's mentioned because it's one of the main points in their paper. As for the value, I don't think there's a meaningful notion of a good threshold value without reference to a particular application. I would recommend experimenting with different values; ideally you'd use the smallest value that doesn't negatively impact accuracy, whatever that might mean in your case.
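For illustration, here is a minimal sketch of capping the matrix yourself, assuming NumPy. The helper name `threshold_distance_matrix`, the example matrix, and the value of `t` are all made up for this example and are not part of pyemd.

```python
import numpy as np


def threshold_distance_matrix(distance_matrix, t):
    """Cap every entry at t, i.e. d_t(a, b) = min(t, d(a, b)).

    Hypothetical helper for illustration; returns a new float64 array
    and leaves the input untouched.
    """
    d = np.asarray(distance_matrix, dtype=np.float64)
    return np.minimum(d, t)


# Example ground distance: |i - j| between four histogram bins.
positions = np.arange(4, dtype=np.float64)
D = np.abs(positions[:, None] - positions[None, :])

# Arbitrary threshold chosen for the example.
D_t = threshold_distance_matrix(D, t=2.0)
print(D_t.max())  # 2.0
```

The capped matrix `D_t` would then be passed as the `distance_matrix` argument of `pyemd.emd` in place of `D`.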

So here is the part that confuses me, from the paper:

The transformation first removes all edges with cost t. Second, it adds a new transhipment vertex. Finally we connect all sources to this vertex with edges of cost t and connect the vertex to all sinks with edges of cost 0.
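As a rough sketch of the transformation quoted above (plain Python, not the authors' implementation): the function name, the vertex labels, and the small cost matrix are hypothetical, and "cost t" is read here as "at or above the threshold after capping".

```python
def thresholded_flow_edges(cost, t):
    """Build the edge list of the transformed flow network.

    cost[i][j] is the ground distance from source i to sink j.
    Returns (from_vertex, to_vertex, edge_cost) tuples.
    """
    hub = "hub"  # hypothetical label for the new transshipment vertex
    edges = []
    for i, row in enumerate(cost):
        for j, d in enumerate(row):
            # Step 1: drop edges whose capped cost equals t.
            if min(d, t) < t:
                edges.append((("source", i), ("sink", j), d))
    # Step 2 and 3: every source reaches the hub at cost t ...
    edges += [(("source", i), hub, t) for i in range(len(cost))]
    # ... and the hub reaches every sink at cost 0.
    edges += [(hub, ("sink", j), 0) for j in range(len(cost[0]))]
    return edges


cost = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
edges = thresholded_flow_edges(cost, t=2)
print(len(edges))  # 13 (7 direct edges + 3 into the hub + 3 out of it)
```

Any flow that would have used a removed edge can instead go through the hub at total cost t + 0 = t, so the optimal cost is unchanged.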

My understanding is that by not considering edges with weight t, the algorithm can achieve linear performance. So how do I 'mark' these edges so that the algorithm skips them?

You don't need to. The threshold for any distance matrix D is just max(D), so their code simply uses that.
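A quick check of that statement, assuming NumPy and a made-up matrix: capping D at its own maximum changes nothing, so the entries equal to max(D) are exactly the ones sitting at the threshold.

```python
import numpy as np

# Arbitrary symmetric distance matrix for the example.
D = np.array([
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 2.0],
    [4.0, 2.0, 0.0],
])

t = D.max()                # implied threshold: the largest distance
capped = np.minimum(D, t)  # d_t(a, b) = min(t, d(a, b))

unchanged = np.array_equal(capped, D)
num_at_threshold = int(np.count_nonzero(D == t))

print(unchanged)         # True
print(num_at_threshold)  # 2
```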

To clarify my previous comment: there's nothing in the arguments of any of the functions that has to do with thresholds.