dgai91 / pytorch-acnn-model

Code for "Relation Classification via Multi-Level Attention CNNs"

Found two bugs that could cause performance inferior to the original paper

jind11 opened this issue · comments

Hi, I have carefully read your code and found two bugs that could potentially explain the inferior performance compared with the original paper. 1. In the new_convolution function, self.tanh() is missing as the activation after the convolution layer. 2. In the original paper, the convolution kernel size is 1, since the input is already a trigram, so there is no need for a kernel size of 3 in the new_convolution function. If you have doubts about my comments, you are welcome to discuss with me, thanks!
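To illustrate the two fixes, here is a minimal sketch (this is not the repo's actual code; the class name TrigramConv and the tensor shapes are hypothetical, chosen only to show kernel size 1 plus an explicit tanh):

```python
import torch
import torch.nn as nn

class TrigramConv(nn.Module):
    """Convolution over trigram inputs.

    Because each position of the input already encodes a trigram,
    the kernel only needs to span one position (kernel_size=1),
    and tanh must be applied explicitly after the convolution,
    since nn.Conv1d includes no activation of its own.
    """
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # kernel_size=1: the trigram context is already baked into the input
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size=1)
        self.tanh = nn.Tanh()

    def forward(self, x):
        # x: (batch, in_channels, seq_len)
        return self.tanh(self.conv(x))  # explicit tanh, previously missing

# example usage (hypothetical sizes: trigram of 50-dim embeddings, seq_len 10)
x = torch.randn(2, 3 * 50, 10)
out = TrigramConv(3 * 50, 100)(x)
print(out.shape)  # torch.Size([2, 100, 10])
```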

Ohh, I thought the conv function included a tanh activation. That is an interesting understanding of the kernel; I think you may be right. I will reconsider the kernel implementation. Thanks for your discussion, keep in touch.

Hi, thanks for the quick response. But I am sure the conv function in PyTorch does not include any activation function.

I read the paper a second time and found that the kernel size should indeed be 1. And the conv layer does not include an activation; I looked that up. Could we keep in touch? You helped a lot. Thank you.

Sure! My email is jindi15@mit.edu. I also have some other revisions to your code; if you want, I can send them to you for reference.

Copy that. So nice of you.

Hi, what is your email? I have some other questions about understanding the algorithm. Or if you are in the US, we can have a phone call. My phone number is 617-710-6221.

Sorry, I sent you an email 4 days ago. Maybe my Gmail cannot get through the firewall. Oh, I am in China, by the way; the VPN policy restricts us further. 😂😂😂
My working email is dgai_ruc@aliyun.com.

Hi, I sent you an email at dgai_ruc@aliyun.com but did not get a reply. My WeChat is jindi930617; feel free to add me if you want.

Hi everybody,

I cannot reproduce the results at all (~25%). What was your final macro F1-score?

Best

@Diego999 Hi. Maybe you can re-read the paper.

@lawlietAi Hi, I have sent you an email about the same confusion. Looking forward to your reply.

@ShomyLiu Thanks for your discussion. I replied to you minutes ago.

Hi, any progress? What's your F1 score now?

Hi, did you reproduce the performance reported in the original paper? I implemented it with TF, but the performance is much lower.

No, I could never replicate the performance in that paper. My result is not good, around 82.7%.

Why is my model's accuracy just 25%?

Have you found the cause of the 25% accuracy?