esimov / pigo

Fast face detection, pupil/eyes localization and facial landmark points detection library in pure Go.


Infinite loop

titpetric opened this issue · comments

This particular line is a possible (and in fact, experienced) infinite loop:

scale = int(float64(scale) * cp.ScaleFactor)

I've figured it out from the stack trace:

github.com/esimov/pigo/core.(*Pigo).classifyRegion(0xc0001a40a0, 0x42, 0x36, 0x9, 0xc00068e000, 0x238c, 0x238c, 0x5b, 0xbf800000)

The goroutine stack trace parameters are as follows:

  • 0xc0001a40a0 (the receiver)
  • 0x42 row (constantly modified)
  • 0x36 col (constantly modified)
  • 0x9 scale (!!does not change!!)
  • other params (do not change)

The issue comes from a rounding/casting error. The main readme (as well as my code) sets the cascade parameter scale factor to 1.1. When the scale is 9, multiplying by the factor gives 9.9, which becomes 9 again because of the int() conversion.

The particular issue is that the source image was 91x100 (don't even ask), and I've calculated the face min/max size as 10% and 50% of the image width respectively. The minimum face size thus becomes 9, and with a scale factor of 1.1 that means an infinite loop.
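
To illustrate the stall, here is a minimal standalone reproduction of the truncation (not the library code itself): with a starting value of 9 and a factor of 1.1, the int() conversion throws away the fractional growth and the value never advances.

package main

import "fmt"

func main() {
	scaleFactor := 1.1
	scale := 9 // 10% of a 91px-wide image, rounded down

	// int(9 * 1.1) == int(9.9) == 9: the truncation discards the growth,
	// so the detection window size never advances past 9.
	for i := 0; i < 3; i++ {
		scale = int(float64(scale) * scaleFactor)
		fmt.Println(scale) // prints 9 every time
	}
}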

Possible fixes:

  • Clamp the minimum value for a face size where it wouldn't cause problems (10px)
  • Better main loop for this stuff to avoid rounding errors

I have tried with a sample image of the same dimensions as you mentioned, but I haven't encountered any infinite loop. Could you show me the command you used, with the applied parameters? It would also be great if you could provide the sample image.

@titpetric thanks for reporting this issue, really appreciated. I have fixed it by clamping to the maximum value when the rounded value exceeds 9.

The scale factor must always be greater than 1.05. This is enforced by the CLI application.

log.Fatal("Scale factor must be greater than 1.05")

Though, considering that the Pigo core can be invoked from a completely separate application, a better option would be to handle these edge cases directly in the core API. I will take care of this.

Your approach works when the scale factor is above 1.0, but it will not overcome the infinite loop when the scale factor is at or below 1.0. I have fixed this issue by ceiling the computed scaling values. This works because at the start of the iteration the fractional difference between two newly computed floating-point values is larger, and it gets smaller and smaller towards the end of the iteration. The infinite loop problem arises at the very start of the iteration process. Another trick to speed up the process was to use a higher dividend, in my case 100.
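
To illustrate (a minimal sketch, not the library's loop): with a factor of 1.0 the multiplicative update reproduces its input unchanged, while the ceiled term is never smaller than 1 for a positive argument, so the window always grows.

package main

import (
	"fmt"
	"math"
)

func main() {
	scaleFactor := 1.0 // pathological value the CLI normally rejects
	scale := 200

	// Multiplicative update: int(200 * 1.0) == 200, the loop would never advance.
	fmt.Println(int(float64(scale) * scaleFactor)) // 200

	// Ceiled update: math.Ceil of a positive value is at least 1,
	// so the window grows by at least one pixel per step.
	fmt.Println(scale + int(math.Ceil(100/float64(scale)*scaleFactor))) // 201
}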

Previously, the first 10 iterations (playground):

#0	20
#1	22
#2	24
#3	26
#4	28
#5	30
#6	33
#7	36
#8	39
#9	42

And the new calculation (playground):

#0	20
#1	26
#2	31
#3	35
#4	39
#5	42
#6	45
#7	48
#8	51
#9	54
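
For reference, here is a small standalone program (a reconstruction of the linked playground snippets, assuming a starting window size of 20 and a scale factor of 1.1, which reproduces the numbers above) that prints both sequences side by side.

package main

import (
	"fmt"
	"math"
)

func main() {
	const scaleFactor = 1.1
	oldScale, newScale := 20, 20 // assumed starting window size

	for i := 0; i < 10; i++ {
		fmt.Printf("#%d\t%d\t%d\n", i, oldScale, newScale)
		// previous update: plain multiplication, then int truncation
		oldScale = int(float64(oldScale) * scaleFactor)
		// new update: additive step, ceiled, with a dividend of 100
		newScale = int(float64(newScale) + math.Ceil(100/float64(newScale)*scaleFactor))
	}
}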

Face detection CPU usage will thus be more intense towards the end of the calculation (after 37 iterations, the size is always incremented by +1). The previous algorithm actually decreased the intensity towards the maximum face detection size with larger deltas, as it should.

Both algorithms are flawed for different reasons.

I suggest:

scale = int(float64(scale) + math.Max(2, (float64(scale)*cp.ScaleFactor)-float64(scale)))

The output here is identical to the previous implementation, except that the value is "clamped" so it iterates by at least two pixels, to avoid the infinite loop; other behaviour is unchanged. You must also clamp cp.ScaleFactor with if cp.ScaleFactor < 1.0 { cp.ScaleFactor = 1.0 }.
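
For context, a rough sketch of how this increment and the ScaleFactor clamp might sit in a window-scaling loop; the cascadeParams struct and its field values are illustrative (roughly the 91px-wide scenario from above), not pigo's actual types.

package main

import (
	"fmt"
	"math"
)

// cascadeParams mirrors the cascade settings discussed in the thread
// (an illustrative struct, not pigo's real type).
type cascadeParams struct {
	MinSize     int
	MaxSize     int
	ScaleFactor float64
}

func main() {
	// roughly titpetric's scenario: 10% and 50% of a 91px-wide image
	cp := cascadeParams{MinSize: 9, MaxSize: 45, ScaleFactor: 1.1}

	// Guard against factors at or below 1.0, which would otherwise
	// make the multiplicative delta non-positive.
	if cp.ScaleFactor < 1.0 {
		cp.ScaleFactor = 1.0
	}

	// Grow the detection window by the usual multiplicative delta,
	// but never by fewer than 2 pixels, so truncation cannot stall the loop.
	for scale := cp.MinSize; scale <= cp.MaxSize; {
		fmt.Println(scale) // 9, 11, 13, ... grows by at least 2 pixels per step
		scale = int(float64(scale) + math.Max(2, (float64(scale)*cp.ScaleFactor)-float64(scale)))
	}
}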

Are you sure? This is what I'm getting running the profiler:

My solution (using_ceil):

      flat  flat%   sum%        cum   cum%
     240ms 47.06% 47.06%      350ms 68.63%  github.com/esimov/pigo/core.(*Pigo).classifyRegion
      60ms 11.76% 58.82%       70ms 13.73%  compress/flate.(*compressor).findMatch
      30ms  5.88% 64.71%       30ms  5.88%  math.IsInf
      30ms  5.88% 70.59%      100ms 19.61%  math.pow
      20ms  3.92% 74.51%       90ms 17.65%  compress/flate.(*compressor).deflate
      20ms  3.92% 78.43%       20ms  3.92%  golang.org/x/image/draw.(*Kernel).transform_RGBA_NRGBA_Src
      10ms  1.96% 80.39%       10ms  1.96%  compress/flate.(*dictDecoder).writeByte
      10ms  1.96% 82.35%       10ms  1.96%  compress/flate.matchLen
      10ms  1.96% 84.31%       10ms  1.96%  github.com/esimov/pigo/core.(*Pigo).Unpack
      10ms  1.96% 86.27%       10ms  1.96%  github.com/esimov/pigo/core.(*Pigo).classifyRegion.func1

Your solution (using_max):

      flat  flat%   sum%        cum   cum%
     380ms 59.38% 59.38%      460ms 71.88%  github.com/esimov/pigo/core.(*Pigo).classifyRegion
      50ms  7.81% 67.19%       60ms  9.38%  compress/flate.(*compressor).findMatch
      50ms  7.81% 75.00%       70ms 10.94%  math.pow
      30ms  4.69% 79.69%      490ms 76.56%  github.com/esimov/pigo/core.(*Pigo).RunCascade
      20ms  3.12% 82.81%       40ms  6.25%  image/png.filter
      10ms  1.56% 84.38%       10ms  1.56%  bufio.(*Reader).ReadByte
      10ms  1.56% 85.94%       10ms  1.56%  compress/flate.(*huffmanEncoder).bitCounts
      10ms  1.56% 87.50%       10ms  1.56%  compress/flate.matchLen
      10ms  1.56% 89.06%       10ms  1.56%  github.com/esimov/pigo/core.(*Pigo).classifyRegion.func1
      10ms  1.56% 90.62%       10ms  1.56%  github.com/esimov/pigo/core.RgbToGrayscale

I have done some additional tests comparing the two methods. There is a trade-off using my approach compared with yours in terms of detection score. My approach results in slightly better performance because, as I explained above, it reduces the iteration steps towards the end of the process; however, it will miss some faces when the detection window size is small. This can be overcome by using 1 as the dividend instead of the arbitrarily selected value of 100, but in that case your solution is perhaps more elegant.

Here are some results with the different approaches:

scale = int(float64(scale) + math.Ceil(1/float64(scale)*cp.ScaleFactor))

Done in: 0.75s

scale = int(float64(scale) + math.Max(2, (float64(scale)*cp.ScaleFactor)-float64(scale)))

Done in: 0.42s

scale = int(float64(scale) + math.Ceil(100/float64(scale)*cp.ScaleFactor))

Done in: 0.12s

Which one do you think would be the best approach?

I will test it further with the real time version to see if it has any impact on the speed.

@titpetric I did another round of tests running the real-time version of the library with the proposed solutions, and it seems that your proposal produces higher computational speed than my solution without having an impact on the detection score. So I opted to stick with your method. I really want to thank you for the effort and time you have put into fixing this issue!