imatge-upc / sentiment-2017-imavis

From Pixels to Sentiment: Fine-tuning CNNs for Visual Sentiment Prediction

Data Preprocessing?

DCurro opened this issue

Hi UPC, I have your model running, but I can't seem to reproduce the sentiment results for the images shown in the figures of your preprint. Was there a preprocessing step I need to perform on my puppy image (such as subtracting the average image)?

Thank you, and I loved the layer ablation!

Hi Domenic,

Yes, the mean image from ILSVRC2012 is subtracted. This is automatically handled by Caffe, but you need to set the mean file first using caffe.io.Transformer.set_mean(). There is an example of the whole preprocessing in generate_sentiment_maps.py (line 39).

You can download the mean file to use (in .npy format) from here.
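
In case it helps, a minimal sketch of just that step (assuming the mean file is saved locally as ilsvrc_2012_mean.npy and that the network's input blob is named 'data'):

import numpy as np
import caffe

# Load the ILSVRC2012 mean image (3 x 256 x 256) and reduce it to a per-channel mean
ilsvrc_mean = np.load('ilsvrc_2012_mean.npy').mean(1).mean(1)

# Register the mean with the transformer so it is subtracted before every forward pass
transformer = caffe.io.Transformer({'data': (1, 3, 227, 227)})  # assumed CaffeNet input shape
transformer.set_mean('data', ilsvrc_mean)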

I am glad you liked our work!

Hi Victor,

Thanks for the fast reply; my results are much closer now! Could I ask you for a favour? I cropped out the kitten, dog, fire-pit, and house-fire images (from this figure, SentimentMaps.png) to see if I could replicate the negative/positive results. Would it be possible to share the results your experiments gave for these four images?

These are the results I received:
kitten:     [ 0.60985792 0.39014202]
dog:        [ 0.61151075 0.38848922]
fire-pit:   [ 0.85933101 0.14066896]
house-fire: [ 0.94442993 0.05557004]

My code is simple:

import numpy as np
import caffe

deploy_path = 'deploy.prototxt'
caffemodel_path = 'twitter_finetuned_test4_iter_180.caffemodel'
net = caffe.Net(deploy_path, caffemodel_path, caffe.TEST)

# Per-channel mean derived from the ILSVRC2012 mean image
mean_file = 'ilsvrc_2012_mean.npy'

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_mean('data', np.load(mean_file).mean(1).mean(1))  # subtract the mean
transformer.set_transpose('data', (2, 0, 1))      # HxWxC -> CxHxW
transformer.set_channel_swap('data', (2, 1, 0))   # RGB -> BGR
transformer.set_raw_scale('data', 255.0)          # [0, 1] -> [0, 255]

image_path = 'kitten.png'
im = caffe.io.load_image(image_path)

out = net.forward_all(data=np.asarray([transformer.preprocess('data', im)]))
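
The per-image scores above can then be read from the softmax output of that forward pass, roughly like this (a minimal sketch, assuming the output blob in deploy.prototxt is named 'prob', as in the standard CaffeNet deploy file):

prob = out['prob'][0]   # two-element softmax vector for this image
print(prob)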

Here are the exact cropped images that I pulled from the diagram:

[attached images: kitten, dog, fire-pit, house-fire]

Hi Domenic,

Your code looks fine. I don't have the exact results for those images (they probably belong to different cross-validation subsets, so it may not make sense to test on them), but with that code you should be able to use the model on your own images.

Hi Victor,

Thank you for the previous advice! I think I am a lot closer to getting it working correctly. I got a chance to validate the classification model using the Twitter five-agree dataset.

I ran each image through the model and got:

true positives: 393
false positives: 93
true negatives: 40
false negatives: 77

I get an F1 score of 82.4% for the positive class and an F1 score of 32.0% for the negative class. The positive case closely matches your reported score of 0.830 ± 0.034. Is this correct?
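
As a sanity check, those F1 scores can be recomputed directly from the counts above (a minimal sketch):

# Confusion-matrix counts listed above
tp, fp, tn, fn = 393.0, 93.0, 40.0, 77.0

# F1 for the positive class
prec_pos = tp / (tp + fp)
rec_pos = tp / (tp + fn)
f1_pos = 2 * prec_pos * rec_pos / (prec_pos + rec_pos)

# F1 for the negative class (the negative label is the target here, so its
# false positives are the fn counted above, and vice versa)
prec_neg = tn / (tn + fn)
rec_neg = tn / (tn + fp)
f1_neg = 2 * prec_neg * rec_neg / (prec_neg + rec_neg)

print(f1_pos, f1_neg)   # roughly 0.82 and 0.32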

Hi Domenic,

Which models are you using for the cross-validation? We have released just one of them (out of five). Please keep in mind that the idea of cross-validation is to always evaluate a model on unseen data, which is not the case when you use the same model for every fold.

The evaluation metric that we used in the paper is accuracy, computed over the whole test set (for both classes).
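
For illustration only, a sketch of that protocol with hypothetical fine-tuning and inference helpers (none of these names come from this repo), where accuracy is the fraction of correctly classified images over both classes of the held-out fold:

import numpy as np

def accuracy(preds, labels):
    # Fraction of correctly classified images, both classes together
    return float(np.sum(preds == labels)) / len(labels)

# Hypothetical stand-ins for fine-tuning and running a model (not code from this repo)
def finetune_model(train_idx):
    return None

def predict(model, test_idx):
    rng = np.random.RandomState(0)
    return rng.randint(2, size=len(test_idx)), rng.randint(2, size=len(test_idx))

num_images = 1000                      # placeholder dataset size
folds = np.array_split(np.random.permutation(num_images), 5)

# Each of the five models is trained on four folds and evaluated
# only on its own held-out fold.
fold_acc = []
for k, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for i, f in enumerate(folds) if i != k])
    model = finetune_model(train_idx)
    preds, labels = predict(model, test_idx)
    fold_acc.append(accuracy(preds, labels))

print('mean accuracy over folds: %.3f' % np.mean(fold_acc))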

Hi Victor,

Thank you so much for all the help, I was able to successfully reproduce your results! My colleagues really like this piece of work, both the sentiment prediction and the layer ablation. This is really exciting stuff, and we are looking forward to seeing future work!

Thanks again!