ChatGPT Solving 1403 Math Konkour Problems
Note: math equations are not properly formatted.
Since the ChatGPT-o1-preview model has a limit of 30 messages per week, I asked only 10 (kinda random) questions (1/4 of all the math problems) and multiplied my results by 4.
GPT scored 100% in 134 seconds (so all 40 problems would take about 134*4/60 = 8.9 minutes). However, at first the score was 80% due to a weird bug (?) causing GPT to think more than needed. I used ChatGPT-o1-mini to see whether it would fail too. Surprisingly, o1-mini solved them all correctly and scored 100%.
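The time estimate above is a simple linear extrapolation. Here is a minimal sketch of that arithmetic, assuming the 134 seconds covers the whole 10-question sample and that the remaining problems take about as long on average. The function name `extrapolate_minutes` is made up for illustration.

```python
# Assumption: 134 s is the total time for the 10 sampled questions,
# and the other 30 problems take about the same time per question.
SAMPLE_SECONDS = 134
SAMPLE_SIZE = 10
TOTAL_PROBLEMS = 40


def extrapolate_minutes(sample_seconds, sample_size, total_problems):
    """Scale the sample time up to the full problem set and convert to minutes."""
    scale = total_problems / sample_size  # 40 / 10 = 4
    return sample_seconds * scale / 60


estimated_minutes = extrapolate_minutes(SAMPLE_SECONDS, SAMPLE_SIZE, TOTAL_PROBLEMS)
print(f"{estimated_minutes:.1f} minutes")  # 8.9 minutes
```

With only 10 sampled questions, this is a rough estimate rather than a measured time for the full exam.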
I used the same questions on the most capable non-reasoning model, GPT-4o, and it scored 40% (actually 20% once the penalty for wrong answers is applied). Results here.
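The drop from 40% to 20% can be reproduced with a small sketch of negative marking. It assumes the usual Konkour rule that each wrong answer costs one third of a correct one, and that all 6 questions GPT-4o missed were answered wrongly rather than left blank. The function name `net_score_percent` is hypothetical.

```python
from fractions import Fraction


def net_score_percent(correct, wrong, total, penalty=Fraction(1, 3)):
    """Percentage score after subtracting `penalty` points per wrong answer.

    Assumption: unanswered questions neither add nor subtract points.
    """
    net = correct - penalty * wrong  # exact arithmetic, no rounding error
    return float(net / total * 100)


# GPT-4o on the 10-question sample: 4 correct, 6 wrong (assumed, none blank).
print(net_score_percent(4, 6, 10))  # 20.0
```

By the same rule, a run with no wrong answers keeps its raw score, so the 100% results for the o1 models are unaffected.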