robwhess / opensift

Open-Source SIFT Library

Strange piece of code (bug or small optim)

cgabard opened this issue

I found a strange piece of code in sift.c (lines 251 to 263): https://github.com/robwhess/opensift/blob/master/src/sift.c#L251

/*
precompute Gaussian sigmas using the following formula:

\sigma_{total}^2 = \sigma_{i}^2 + \sigma_{i-1}^2
*/
sig[0] = sigma;
k = pow( 2.0, 1.0 / intvls );
for( i = 1; i < intvls + 3; i++ )
{
  sig_prev = pow( k, i - 1 ) * sigma;
  sig_total = sig_prev * k;
  sig[i] = sqrt( sig_total * sig_total - sig_prev * sig_prev );
}

We can see that:

sig[0] = sigma
sig[i] = sqrt(sig_total * sig_total - sig_prev * sig_prev )
       = sqrt(sig_prev * sig_prev * k * k - sig_prev * sig_prev )
       = sig_prev * sqrt(  k * k -  1 )
       = k^(i-1) * sigma * sqrt(  k * k -  1 )

then

sig[0] = sigma
sig[1] = sigma * sqrt(  k * k -  1 )
sig[2] = k * sigma * sqrt(  k * k -  1 ) = k* sig[1] 
sig[3] = k^2 * sigma * sqrt(  k * k -  1 ) = k* sig[2] 
...

These lines can then be rewritten with less computation:

k = pow( 2.0, 1.0 / intvls );
double sk2m1 = sqrt( k*k- 1 );
sig[0] = sigma;
sig[1] = sigma * sk2m1;
for (int i = 2; i < intvls + 3; i++)
    sig[i] = sig[i-1] * k;

In the original code it seems that sig_prev != sig[i-1], yet the formula sqrt(sig_total * sig_total - sig_prev * sig_prev) uses sig_prev rather than sig[i-1].

@cgabard yes, sig[i-1] should not equal sig_prev. Because we are convolving the previous image in the scale space pyramid to get the current one, each sig[i] represents only the incremental sigma needed to get from the total sigma applied to the previous image (i.e. sig_prev) to the total sigma needed for this image (i.e. sig_total). Sorry the comment is a bit vague about what's actually going on here. It only provides the formula that eventually leads to getting the correct incremental sigmas without actually being explicit about what the computation is doing.

It looks like you're right about the simplification of the math involved here. Would you want to submit a pull request with that optimization?

Yes, I will do that.

@robwhess I think it is actually a bug. For example, if you choose the number of intvls to be 2, so that k = sqrt(2), then sig[0] = sig[1]. According to Lowe, they should not be the same (i.e. sig[i] = k*sig[i-1]). What is the motive behind the sig_total and sig_prev stuff? Why not just do what it says in Lowe's paper? It looks like you are using properties of cascading Gaussians (sig^2 = sig1^2 + sig2^2), but it is not giving the desired effect. Or maybe I am missing something.

@chrislgarry each sig[i] is the incremental sigma used to achieve the desired sigma (sig_total) for level i. In other words, the image at level i-1, which has total sigma sig_prev, is convolved with sigma sig[i] to achieve a total effective sigma of sig_total at level i. So, if you have intvls = 2 and sigma = sqrt(2), sig[0] = sig[1] = sqrt(2) would achieve a total sigma of 2 (sqrt(2) * sqrt(2), i.e. k * sigma) at level 1 in the pyramid, which I believe is the desired effect.

@chrislgarry in other words, this is doing exactly what it says in Lowe's paper. The key is that the approach I've coded up is a more efficient implementation than just continually convolving the base image of the octave with an increasing sigma because it helps keep the convolution kernel small.

@robwhess Ah ok. Gotcha. I've submitted a pull request for the optimization. Your SIFT implementation is actually being used in OpenCV (pretty cool if you ask me), and someone filed a bug for what I was just asking you about. I've closed that bug and will also push these optimization changes upstream.

Fixed by #9.