kevinduh / san_mrc

Stochastic Answer Networks (SAN) for Machine Reading Comprehension

Upload logs for Squad 2.0

LearningPytorch opened this issue

Hi
Thanks for the new code, it's great. Could you also upload the logs (san.log) for SQuAD 2.0? I want to make sure that I'm getting scores similar to yours. Thank you again.

In particular, I want to check the vocab size, since you changed prepro.py. My log shows:
Raw vocab size vs vocab in glove: 106415/90949
OOV rate:1.2000=262509/21875454
final vocab size: 90953
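
The figures in this log can be cross-checked with a few lines of Python. This is only a sanity-check sketch: it assumes the OOV rate is printed as a percentage of all tokens, and that the difference between the GloVe-covered vocabulary and the final vocabulary comes from reserved tokens. That second point is an assumption, since prepro.py is not shown in this thread.

```python
# Numbers copied from the log lines quoted above.
raw_vocab = 106415        # distinct word types seen in the data
in_glove = 90949          # word types that also appear in GloVe
oov_tokens = 262509       # token occurrences without a GloVe vector
total_tokens = 21875454   # all token occurrences
final_vocab = 90953       # vocabulary size reported at the end

# OOV rate as a percentage of all token occurrences.
oov_rate = 100.0 * oov_tokens / total_tokens
print(f"OOV rate: {oov_rate:.4f}%")

# Share of raw vocabulary types covered by GloVe.
coverage = 100.0 * in_glove / raw_vocab
print(f"GloVe coverage of raw vocab: {coverage:.2f}%")

# Entries in the final vocab beyond the GloVe-covered words
# (assumed, not confirmed, to be reserved tokens such as padding/unknown).
extra_entries = final_vocab - in_glove
print(f"Extra entries in final vocab: {extra_entries}")
```

If another run prints the same three log lines, the numbers are internally consistent and the preprocessing step is unlikely to explain a score gap on its own.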

Sure. I'll release it soon.

Thanks a lot! I'll wait for it. It'll be very useful to compare against your performance on SQuAD 2.0.
For example, what are your EM and F1 scores on the SQuAD 2.0 dev set?

We got 69.x/72.x on dev in terms of EM/F1. We're writing a tech report about our model and experiments and will publish it soon.

Wow! That's much higher than what I got when I ran this package: best EM: 62.x, F1: 66.x.
Did you see anything unusual in the vocab size I posted above? I'm not sure why my performance is ~6 points lower than yours.
On 1.1, I was able to get almost the same numbers as you reported by running your system.

@namisan is there any way you can upload the updated code soon? Your code is good and I'm learning a lot about SQuAD 2.0 from it. I'm working on a course project with some of your code. I saw in the other open issue that you're going on vacation. I hope you can upload it before then. Thanks.

I got similar results (EM: 62.x, F1: 66.x), so maybe something is wrong.

Hi @hackiey. Thanks for confirming that you got the same or similar results as me. The package reproduces the results reported in the readme for SQuAD 1.1, but not for 2.0. @namisan @kevinduh maybe we're doing something wrong?

The current config is for v1.1, not for 2.0. As described in the attached tech report, using a lower dropout rate (e.g., 0.1) and a larger hidden size (300) could lead to a better result. Hope this helps. I'm currently on vacation and will check in the logs or models once I'm back.
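
As a rough illustration of the suggestion above, the two changes could be applied to a training config in one place. This is a minimal sketch only: the key names and the starting values are hypothetical placeholders, not the actual options of this repository, so they would need to be mapped onto whatever the real config defines.

```python
import copy

# Hypothetical v1.1-style config. Key names and values are placeholders
# chosen for illustration; they are NOT taken from this repository.
base_config = {
    "dropout_p": 0.4,
    "dropout_emb": 0.4,
    "hidden_size": 128,
    "max_epochs": 50,
}

def make_v2_config(config, dropout=0.1, hidden_size=300):
    """Return a copy of config with every dropout-like key set to `dropout`
    and every hidden-size-like key set to `hidden_size`."""
    updated = copy.deepcopy(config)
    for key in updated:
        if "dropout" in key:
            updated[key] = dropout
        elif "hidden_size" in key:
            updated[key] = hidden_size
    return updated

v2_config = make_v2_config(base_config)
print(v2_config)
```

Matching on key substrings is a convenience for configs with many dropout and hidden-size options. It can also touch keys that should stay unchanged, so the resulting config is worth reviewing before training.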

@namisan I hope you will upload all the code that gives you the performance gains. That would be very useful. Have a good vacation.

Are you back, @namisan?

@namisan could you update your hyperparameters for SQuAD 2.0? I have tried changing the dropout and hidden size, and the highest F1 I reached was 69.3.

@hackerwei could you please elaborate on which params you changed? There are so many dropout params and hidden-size variables. It would be great if you could upload your config.py file. Thank you.

@namisan we are waiting for you too, since your params would let us reproduce the numbers reported in the tech report.

I've released the worksheets for the official submissions, so I'm closing this issue.