Duplicate agents in properties table

Question

dhamaris opened this issue 3 years ago

Hello,

I am not sure whether this is a bug or a feature:
https://paperswithcode.com/sota/visual-question-answering-on-gqa-test2019

Is it OK that the agent is repeated? I assumed there would be only one agent when all other fields (paperUrl, date, source, etc.) are the same.

Thank you

dhamaris · Answer 1 · Wed Jul 07 2021 02:57:42 GMT+0800 (China Standard Time)

redundant_tuplas_by_metric_pwc.xlsx
I don't know if this will be of any use, but I identified the tuples of experiments that contain more than one value when grouped by task, dataset, agent, paperDate, paperUrl and metric.
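For anyone who wants to reproduce this check without the spreadsheet, here is a minimal sketch of the grouping described above. It assumes the experiments have already been flattened into one record per metric value; the `rows` sample, the `value` field and the `find_redundant_groups` helper are hypothetical placeholders, not real leaderboard data or part of any Papers with Code tooling.

```python
from collections import defaultdict

# Hypothetical flattened records (made-up values). The key field names
# follow the ones mentioned in this thread; "value" is an assumed name.
rows = [
    {"task": "VQA", "dataset": "GQA", "agent": "ModelA",
     "paperDate": None, "paperUrl": None, "metric": "Accuracy", "value": 60.1},
    {"task": "VQA", "dataset": "GQA", "agent": "ModelA",
     "paperDate": None, "paperUrl": None, "metric": "Accuracy", "value": 61.3},
    {"task": "VQA", "dataset": "GQA", "agent": "ModelB",
     "paperDate": None, "paperUrl": None, "metric": "Accuracy", "value": 58.0},
]

KEY_FIELDS = ("task", "dataset", "agent", "paperDate", "paperUrl", "metric")


def find_redundant_groups(records):
    """Return the groups that hold more than one value for the same key."""
    groups = defaultdict(list)
    for record in records:
        # None is hashable, so null paperDate/paperUrl values still group.
        key = tuple(record[field] for field in KEY_FIELDS)
        groups[key].append(record["value"])
    return {key: values for key, values in groups.items() if len(values) > 1}


redundant = find_redundant_groups(rows)
for key, values in redundant.items():
    print(key, values)
```

Note that null paperDate and paperUrl values compare equal here, so entries without paper information collapse into the same group.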

Elvis Saravia · Answer 2 · Wed Jul 07 2021 17:50:41 GMT+0800 (China Standard Time)

@dhamaris Thanks for opening this issue. Looking at the original source of the results, it appears that the submissions are unique in the sense that they come from two different authors. Because we pull information from other sources, we expect these scenarios to appear, but they shouldn't be too common. Ideally, the model names should be unique.

Here is the original source of the results you pointed out: https://eval.ai/web/challenges/challenge-page/225/leaderboard/733#leaderboardrank-18

dhamaris · Answer 3 · Wed Jul 07 2021 18:23:42 GMT+0800 (China Standard Time)

@omarsar Thank you for your swift response. We normalized evaluation-tables.json into a DB with each tuple found in the tree.
The problem is that I lost the information needed to tell whether an experiment belongs to a specific contribution or not; it is all mixed together.

I tried to group by paperDate and paperUrl, but that does not solve the problem when they are null or when they match.
So I am considering adding a new field to my model, called something like contributionId, that identifies a unique SOTARow object, but I am still not sure. What is your take on this? Are these 2 experiments the same, but added by 2 people? Or are these 2 experiments 2 actual items? Why would you say that these scenarios shouldn't be too common? I am trying to understand this table. Is it uncommon for people to report experiments by the same agent with multiple results?

And if the modelNames should be unique, does that mean this redundant information will be corrected at some point?
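The contributionId idea from this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the nested `tables` structure is a simplified, hypothetical stand-in for evaluation-tables.json (the real file layout may differ), the values are made up, and `flatten_with_contribution_id` is a hypothetical helper. The point is that the id is assigned once per SOTA row before the metrics are split out, so two submissions with the same agent and null paperUrl/paperDate stay distinguishable.

```python
from itertools import count

# Hypothetical, simplified stand-in for the nested JSON (made-up values).
tables = [
    {
        "task": "VQA",
        "dataset": "GQA",
        "rows": [
            {"modelName": "ModelA", "paperUrl": None, "paperDate": None,
             "metrics": {"Accuracy": 60.1, "Binary": 77.0}},
            {"modelName": "ModelA", "paperUrl": None, "paperDate": None,
             "metrics": {"Accuracy": 61.3, "Binary": 78.2}},
        ],
    }
]


def flatten_with_contribution_id(nested):
    """Emit one record per metric value, tagged with its source row's id."""
    ids = count(1)
    flat = []
    for table in nested:
        for sota_row in table["rows"]:
            # One id per SOTA row, shared by all of that row's metrics.
            contribution_id = next(ids)
            for metric, value in sota_row["metrics"].items():
                flat.append({
                    "contributionId": contribution_id,
                    "task": table["task"],
                    "dataset": table["dataset"],
                    "agent": sota_row["modelName"],
                    "paperUrl": sota_row["paperUrl"],
                    "paperDate": sota_row["paperDate"],
                    "metric": metric,
                    "value": value,
                })
    return flat


flat = flatten_with_contribution_id(tables)
```

Grouping the flattened records by contributionId then recovers each original row, even when every other field matches.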

Elvis Saravia · Answer 4 · Wed Jul 07 2021 23:08:19 GMT+0800 (China Standard Time)

I am not sure how to solve what you are trying to achieve. I can't say for certain, but I am inclined to think that the results correspond to two different experiments by two different users. Maybe they use a similar model, backbone, or experimental setup, or even the same code base, as is common in public community leaderboards such as Kaggle. If the results are obtained from external leaderboards, as in this case, it is possible we will see this type of result. If the results are added on our website directly, it is less common, as results are added directly through papers, which helps preserve a one-model-to-one-result relationship.

dhamaris · Answer 5 · Wed Aug 04 2021 17:26:07 GMT+0800 (China Standard Time)

Thank you, we will keep the relationship 1 to 1 as well.