tpfister / caffe-heatmap

Caffe with heatmap regression & spatial fusion layers. Useful for any CNN image position regression task.

Home Page: http://www.robots.ox.ac.uk/~vgg/software/cnn_heatmap

Euclidean loss heatmap layer error??

zhaishengfu opened this issue · comments

Hello. Thank you for sharing your code, but I think I have found a serious error. In your file euclidean_loss_heatmap_layer.cpp, you have the following code at line 81:
    for (int idx_ch = 0; idx_ch < num_channels; idx_ch++)
    {
        for (int i = 0; i < label_height; i++)
        {
            for (int j = 0; j < label_width; j++)
            {
                int image_idx = idx_img * label_img_size + idx_ch * label_channel_size + i * label_height + j;
                float diff = (float)bottom_pred[image_idx] - (float)gt_pred[image_idx];

But I think image_idx should be idx_img * label_img_size + idx_ch * label_channel_size + i * label_width + j;
That is, label_height should be changed to label_width. Am I wrong?
Looking forward to your reply!

This is not an error; the mistake is yours.

Sorry if I got it wrong, but can you tell me more? I have changed your code as I described, and it works correctly for me. As I understand it, the variables i and j loop over label_height and label_width (for example, label_width = 176 and label_height = 216), so i * label_height can reach almost 216*216! Am I wrong about the meaning of i, j, label_width or label_height?
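The indexing argument can be checked with a small standalone sketch. It assumes the blob is stored row-major in N x C x H x W order, as Caffe blobs are. The function names offset_nchw and offset_quoted are hypothetical and are not part of caffe-heatmap; the second one only mirrors the arithmetic of the quoted loop.

```cpp
#include <cassert>

// Row-major offset of element (n, c, i, j) in an N x C x H x W blob.
// i is the row (0 .. height-1) and j is the column (0 .. width-1),
// so each row step must advance by `width` elements.
int offset_nchw(int n, int c, int i, int j, int channels, int height, int width) {
    assert(i >= 0 && i < height && j >= 0 && j < width);
    return ((n * channels + c) * height + i) * width + j;
}

// Same arithmetic as the loop quoted in this thread: the row index is
// multiplied by `height` instead of `width`. (Hypothetical helper for comparison.)
int offset_quoted(int n, int c, int i, int j, int channels, int height, int width) {
    assert(i >= 0 && i < height && j >= 0 && j < width);
    const int channel_size = height * width;
    const int img_size = channels * channel_size;
    return n * img_size + c * channel_size + i * height + j;
}
```

With the sizes from this thread (one channel, height 216, width 176), a channel holds 216 * 176 = 38016 values. For the last pixel (i = 215, j = 175), offset_nchw gives 38015, the final valid index, while offset_quoted gives 46615, which is past the end of the channel. When height equals width the two functions agree, which may be why the difference goes unnoticed on square heatmaps.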

Hi @zhaishengfu, I think you are right. However, it's strange that the original code works and no one has raised this issue before. Do your regression results look good?

@liuxhy237 I think this is because they don't use this code for visualization. In any case, I think it only affects the visualization, so the code still works.

@zhaishengfu I see. I guess label_height is set equal to label_width in most cases, which would hide the problem. But it is indeed a bug. Maybe you can make a PR.
And THANK YOU for raising this issue!

@zhaishengfu
Sorry to use your thread, but I have a related question.

In euclidean_loss_heatmap_layer.cpp there is code like this:

    ...
    for (int i = 0; i < label_height; i++)
    {
        for (int j = 0; j < label_width; j++)
        {
            int image_idx = idx_img * label_img_size + idx_ch * label_channel_size + i * label_height + j;
            float diff = (float)bottom_pred[image_idx] - (float)gt_pred[image_idx];
            loss += diff * diff;
    ...

Is it possible to get the (x, y) coordinates of a prediction instead of an "area"?

Thank you.

Oh, it's nice to find that @zhaishengfu has already raised this issue. I think that, in principle, he is right.
@rimphyd What do you mean by the "area"? And isn't it the case that x, y == j, i? What do you mean by "x, y"?
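One common reading of the question is to take the location of the largest heatmap value as the predicted point. The sketch below assumes that reading, and assumes a single height x width heatmap channel copied into a row-major std::vector. PeakLocation and heatmap_argmax are hypothetical names, not part of caffe-heatmap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Location and value of the strongest response in one heatmap channel.
struct PeakLocation {
    int x;        // column index (j)
    int y;        // row index (i)
    float value;  // heatmap value at (x, y)
};

// Scans a row-major height x width heatmap and returns its maximum.
// On ties, the first maximum in scan order is kept.
PeakLocation heatmap_argmax(const std::vector<float>& heatmap, int height, int width) {
    assert(height > 0 && width > 0);
    assert(heatmap.size() == static_cast<std::size_t>(height) * width);
    PeakLocation best{0, 0, heatmap[0]};
    for (int i = 0; i < height; ++i) {
        for (int j = 0; j < width; ++j) {
            // The row stride is `width`, matching the fix discussed in this thread.
            const float v = heatmap[static_cast<std::size_t>(i) * width + j];
            if (v > best.value) {
                best = PeakLocation{j, i, v};
            }
        }
    }
    return best;
}
```

Here x corresponds to j and y to i, as suggested above. If the heatmap is smaller than the input image, the returned coordinates would still need to be scaled back to image size.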