IQSS / dataverse-client-r

R Client for Dataverse Repositories

Home Page: https://iqss.github.io/dataverse-client-r

inconsistent order of dataset files

wibeasley opened this issue

@pdurbin, has there been a recent change to how dataset files are returned? Last night, @kuriwaki had to modify the testing code to adapt to the order.

Curiously, we have trouble only with the NLSY dataset, not the basketball dataset. Maybe that's because the first has two real data files, whereas the second has a dataset and an SVG?

We're rushing to resubmit to CRAN so a copy is available for Shiro's presentation tomorrow. The inconsistent order caused the package to be rejected last night (#84, #83).

  • I think it is an issue with the order of the datasets presented in a dataverse object, not so much the files within the dataset. That is, sometimes the NLSY dataset would be the first item and the basketball dataset the second, and other times vice versa.
  • Perhaps there is just some randomness in which gets loaded first? We had a CRAN test suite that tripped up on Debian but not on the other operating systems.

Thanks for straightening me out, @kuriwaki. I looked through the R-hub logs and found one instance where the basketball dataset file was expected and, I guess, the NLSY one was returned.

https://builder.r-hub.io/status/dataverse_0.3.1.tar.gz-7809c4af0c504a5dbc7994623c098dc1

I think you're talking about the "contents" API and I don't believe there's ever been an "order by" to make the results deterministic. The code looks like this:

@NamedQuery(name = "DvObject.findByOwnerId", 
            query = "SELECT o FROM DvObject o WHERE o.owner.id=:ownerId"),

(This is called, ultimately, from ListDataverseContentCommand.)

If you'd like the "contents" API to return results in a particular order, please feel free to open an issue about it.
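In the meantime, a client can impose its own order on whatever the server returns. The following is only a minimal Python sketch of that idea, not code from Dataverse or from this package. It assumes each item in a contents listing is a dict with `type` and `id` fields; those field names and the sample values are illustrative assumptions.

```python
def sort_contents(items):
    """Return items in a repeatable order: by type, then by id.

    Assumes (hypothetically) that every item has "type" and "id" keys.
    """
    return sorted(items, key=lambda item: (item["type"], item["id"]))


# Hypothetical listing, as it might arrive in two different server orders.
first_response = [
    {"type": "dataset", "id": 20, "title": "NLSY"},
    {"type": "dataset", "id": 10, "title": "basketball"},
]
second_response = list(reversed(first_response))

# After sorting, both responses compare equal, so a test that indexes
# into the sorted list no longer depends on server-side order.
assert sort_contents(first_response) == sort_contents(second_response)
```

Sorting on the client only hides the nondeterminism; it does not change what the server sends.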

I hope this helps. I'm sorry to hear about the test suite troubles.

We got around this randomness in our tests by specifying the item in contents.
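For anyone hitting the same problem, the workaround amounts to selecting an item by a stable field instead of by its position. Here is a rough Python sketch of the idea (the package's actual R tests are not shown in this thread). The helper `find_item`, the `identifier` field, and the sample values are all hypothetical.

```python
def find_item(contents, key, value):
    """Return the single item whose `key` equals `value`, whatever its position."""
    matches = [item for item in contents if item.get(key) == value]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one item with {key}={value!r}, found {len(matches)}"
        )
    return matches[0]


# Hypothetical contents listing; the identifiers are made up for illustration.
contents = [
    {"identifier": "hypothetical/NLSY", "title": "NLSY"},
    {"identifier": "hypothetical/BASKETBALL", "title": "basketball"},
]

# The lookup gives the same item even if the server reverses the list.
nlsy = find_item(contents, "identifier", "hypothetical/NLSY")
same = find_item(list(reversed(contents)), "identifier", "hypothetical/NLSY")
assert nlsy == same
```

Raising an error when zero or several items match makes a test fail loudly rather than silently picking the wrong dataset.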