Results Using OpenCV DNN Module

Question

opened this issue 5 years ago · comments

I had a hard time getting Caffe installed, so I figured I'd try out your model using OpenCV's cv2.dnn.readNetFromCaffe() along with your .caffemodel and .prototxt files.

The output of net.forward() when using this method is a 1x21x300x300 array, which can be squeezed to 21x300x300. Each of the 21 300x300 arrays, when normalized, seems to constitute a type of heat map. Some of these can be seen below.
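The per-channel normalization described above can be sketched roughly as follows. This is only an illustration: `normalize_channels` is a hypothetical helper, and the random array stands in for the squeezed result of net.forward(), so it is not real model output.

```python
import numpy as np

def normalize_channels(scores):
    """Min-max scale each channel of a (C, H, W) array to uint8 in [0, 255]."""
    scores = np.asarray(scores, dtype=np.float32)
    lo = scores.min(axis=(1, 2), keepdims=True)
    hi = scores.max(axis=(1, 2), keepdims=True)
    # Guard against division by zero on a completely flat channel.
    span = np.where(hi > lo, hi - lo, 1.0)
    return ((scores - lo) / span * 255.0).astype(np.uint8)

# Synthetic stand-in for np.squeeze(net.forward()); NOT real model output.
rng = np.random.default_rng(0)
fake_scores = rng.normal(size=(21, 300, 300)).astype(np.float32)

heatmaps = normalize_channels(fake_scores)
print(heatmaps.shape, heatmaps.dtype)
```

Each `heatmaps[i]` can then be viewed or saved as a grayscale image. Note that this scaling is only for visualization; it stretches every channel independently, so the heat maps cannot be compared against each other.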

My question is, how would I combine these to get the actual face segmentation? I tried to parse your code to see if I could figure it out but fell short of understanding. Thanks!

Yuval Nirkin · Answer 1 · Mon Nov 11 2019 22:42:54 GMT+0800 (China Standard Time)

The 300x300 model has only two output channels.

Deleted user · Answer 2 · Mon Nov 11 2019 23:09:27 GMT+0800 (China Standard Time)

Does that mean the output should only be a 2x300x300 and not a 21x300x300? Then, in the following block of code from face_seg.py, are you simply taking the maximum of the two channels?

# run net and take argmax for prediction
net.forward()
out = net.blobs['score'].data[0].argmax(axis=0)

The end result must be a binary mask, but all of the outputs I'm seeing in the 21x300x300 array I get are float values, many of which are even negative. I'm wondering if the output of net.forward() may differ between the actual Caffe library and OpenCV's cv2.dnn.readNetFromCaffe(). I'm not sure though...
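As a small illustration of why float (and negative) scores are compatible with a binary mask: argmax over the channel axis only compares the scores at each pixel, so their sign and scale do not matter. The sketch below uses made-up numbers, and it assumes channel 0 is background and channel 1 is face, which the thread does not confirm.

```python
import numpy as np

# Hypothetical raw scores for a 2-channel, 2x3 output (values are invented).
# Assumption: channel 0 = background, channel 1 = face.
scores = np.array([
    [[ 2.0, -1.0,  0.5], [ 3.0, -2.0, -0.5]],   # channel 0
    [[-1.5,  0.5,  0.7], [-3.0, -1.0, -4.0]],   # channel 1
], dtype=np.float32)

# Per-pixel index of the larger score: already a 0/1 mask for two channels.
mask = scores.argmax(axis=0)

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

# Softmax turns scores into per-pixel probabilities but keeps their ordering,
# so the argmax of the probabilities is the same mask.
probs = softmax(scores)
print(mask)
print(np.array_equal(probs.argmax(axis=0), mask))
```

So the raw output of a segmentation network is normally a stack of unnormalized score maps, and the binary mask only appears after the argmax step.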

For reference, here is my code:

import numpy as np
import cv2

image = cv2.imread('Alison_Lohman_0001.jpg')

# Define the .prototxt and .caffemodel paths, and create the network
caffeModel = "face_seg_fcn8s.caffemodel"
prototextPath = "face_seg_fcn8s_deploy.prototxt"
net = cv2.dnn.readNetFromCaffe(prototextPath,caffeModel)

# Resize to the 300x300 network input size
image = cv2.resize(image, (300, 300))
# Build the input blob, subtracting the per-channel mean (B, G, R order, matching cv2.imread)
blob = cv2.dnn.blobFromImage(image, 1.0, (300, 300), (104.00698793, 116.66876762, 122.67891434))

# Passing blob through network
net.setInput(blob)
output = net.forward()

The output is a 1x21x300x300 float32 array.

Deleted user · Answer 3 · Sat Nov 16 2019 01:01:25 GMT+0800 (China Standard Time)

Ah, I see what you mean about the 300x300 model. It produces a 1x2x300x300 output.

I'm still unsure of what to do with the two resulting images. After normalization, they look like this for the Alison_Lohman_0001.jpg image:

kiralygomba · Answer 4 · Tue Jun 23 2020 20:02:38 GMT+0800 (China Standard Time)

Did you find the answer to this? I'm trying to use it in OpenCV as well and am running into the same problems.

Deleted user · Answer 5 · Wed Jul 01 2020 04:25:32 GMT+0800 (China Standard Time)

@kiralygomba No, unfortunately not.

charleswg · Answer 6 · Mon Feb 08 2021 13:38:01 GMT+0800 (China Standard Time)

@kiralygomba Looks like the only thing missing is the last operation:
mask = output[0].argmax(axis=0)
mask = 1 * (mask > 0)

Though I must say it doesn't quite work as I thought.
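For anyone trying this, the two lines above could be wrapped up roughly as in the sketch below. It is not the author's code: `output_to_mask` and `apply_mask` are hypothetical helper names, the arrays are synthetic stand-ins for net.forward() and the resized image, and treating every non-zero class as foreground is an assumption.

```python
import numpy as np

def output_to_mask(output):
    """Turn a (1, C, H, W) score blob into an (H, W) uint8 mask of 0 or 255."""
    output = np.asarray(output)
    if output.ndim != 4 or output.shape[0] != 1:
        raise ValueError("expected a blob of shape (1, C, H, W)")
    labels = output[0].argmax(axis=0)  # per-pixel class index
    # Assumption: class 0 is background, any other class is foreground.
    return np.where(labels > 0, 255, 0).astype(np.uint8)

def apply_mask(image, mask):
    """Zero out the pixels of an (H, W, 3) image where the mask is 0."""
    if image.shape[:2] != mask.shape:
        raise ValueError("image and mask must have the same height and width")
    return image * (mask[..., None] > 0)

# Synthetic stand-ins; with the real model these would be net.forward()
# and the 300x300 resized input image.
fake_output = np.zeros((1, 2, 4, 4), dtype=np.float32)
fake_output[0, 1, 1:3, 1:3] = 5.0  # channel 1 wins in the 2x2 center
fake_image = np.full((4, 4, 3), 200, dtype=np.uint8)

mask = output_to_mask(fake_output)
segmented = apply_mask(fake_image, mask)
print(mask)
```

With a two-channel output, argmax already yields only 0 and 1, so the `mask > 0` step changes nothing. With the 21-channel output it would merge all 20 non-zero classes into one foreground, which may be one reason the result looks different from what was expected.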