Hi Piotr,

Thanks for your great work and code release. I am currently working on saliency and noticed that you propose a new saliency metric (Sec. 3.2), s(a, p) = log(a) - log(p). However, when I evaluated it on ImageNet with the Caffe pretrained GoogleNet, I got the following results:
| | Saliency Metric (paper reported) | Saliency Metric (mine) |
|---|---|---|
| ground truth | 0.284 | 0.3044 |
| max box | 1.366 | 1.3443 |
| central box | 0.645 | 0.7238 |
From the max box result, I think my calculation conforms to yours. The ground truth result is also close (the difference arises because I didn't evaluate all the ground truth boxes). But the difference in the central box result is quite large, and I cannot figure out what's going wrong. Could you please release the saliency metric evaluation code? Thank you very much, and I look forward to your reply.
Best regards
Yulong
Hmm, that's strange, but it could be explained by differences in the classifier (are you sure you are using GoogleNet?) and in the resizing strategy. The classifier has a very similar loss on both the max box and the central box (the -log(p) term). The area is halved in the case of the central box, so the area term will be log(1) = 0 for the max box and log(0.5) ≈ -0.7 for the central box. We would therefore expect the saliency metric for the central box to be about 0.7 smaller. In my case the classifier actually performed slightly better on the central box than on the max box; in your case it appears that the opposite was true. Anyway, the results look reasonable, but make sure you are using the same net.
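To make the area-term argument concrete, here is a minimal sketch in Python. It assumes natural logarithms and that the area fraction is exactly 1 for the max box and 0.5 for the central box on every image, so the mean metric splits into log(a) plus the mean classifier loss -log(p). The function names are hypothetical and this is not the evaluation code used for the paper.

```python
import math


def saliency_metric(area_fraction, class_prob):
    """s(a, p) = log(a) - log(p), using natural logarithms (assumed)."""
    return math.log(area_fraction) - math.log(class_prob)


def implied_classifier_loss(mean_saliency, area_fraction):
    """Recover the mean -log(p) term from a mean saliency value.

    Assumes every image uses the same area fraction, so that
    mean(s) = log(a) + mean(-log(p)).
    """
    return mean_saliency - math.log(area_fraction)


# Values taken from the table in this thread.
reported = {
    "paper": {"max box": 1.366, "central box": 0.645},
    "mine": {"max box": 1.3443, "central box": 0.7238},
}

for name, rows in reported.items():
    loss_max = implied_classifier_loss(rows["max box"], 1.0)
    loss_central = implied_classifier_loss(rows["central box"], 0.5)
    print(f"{name}: max box loss {loss_max:.3f}, central box loss {loss_central:.3f}")
```

Under these assumptions the paper's numbers imply a slightly lower classifier loss on the central box than on the max box, while the second column implies the opposite, which matches the explanation above.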
The main implementation used for the paper was originally written in TensorFlow and was a mess, so I reimplemented the main part in PyTorch. I could release the evaluation code as well if that helps.
I'm currently using the Caffe pretrained GoogleNet, which is also called Inception-v1. Do you also use the Inception-v1 model? I found the official TensorFlow implementation here, is that right?
I also want to confirm the preprocessing procedure for the central box. First, crop the area at the center of the image with size (1/sqrt(2) * H, 1/sqrt(2) * W). Then resize the cropped image to 224 x 224 and apply the same normalization as before. Did I miss something? Thanks for your reply.
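The crop geometry described above can be sketched as follows. This is only an illustration of the steps as stated in this comment, not a confirmed description of the paper's preprocessing; the function name `central_box` and the rounding to the nearest pixel are assumptions.

```python
import math


def central_box(height, width):
    """Return (top, left, crop_h, crop_w) for a centered crop.

    Each side is scaled by 1/sqrt(2), so the crop covers roughly half
    of the image area. Rounding to whole pixels is an assumption.
    """
    crop_h = round(height / math.sqrt(2))
    crop_w = round(width / math.sqrt(2))
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, left, crop_h, crop_w


top, left, crop_h, crop_w = central_box(375, 500)
area_fraction = (crop_h * crop_w) / (375 * 500)
print(top, left, crop_h, crop_w, round(area_fraction, 4))
```

The resulting box would then be cropped, resized to 224 x 224, and normalized with whatever image library is in use. Because of pixel rounding the area fraction is only approximately 0.5, which may account for a small part of any discrepancy.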