Minor discrepancies between subfolder csvs and master sheet
ArthurSpirling opened this issue · comments
Hello @vincentarelbundock -- thanks so much for providing these data.
I did a very quick scan through the data and its documentation. In particular, I was looking for any discrepancies between this main sheet and the names of the data sets themselves (as in name.csv) stored in the subfolders.
Here are some that appear in the data as csvs but are not documented on the sheet. This was very rough and ready, and I might have missed something, but just in case it's helpful for your sweeps:
"aldh2" "apoeapoc" "bomregions2011" "bomregions2012"
"bomsoi2001" "cf" "cnv" "crohn"
"Damian" "fa" "fsnps"
"head.injury" "hla" "inf1"
"jma.cojo" "l51" "lukas" "mao"
"meyer" "mfblong" "mr" "nep499"
"PD"
For example, bomregions2012.csv appears in the DAAG subfolder, but not on that master sheet. And indeed, it has documentation here.
Again, thanks for all this work!
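A comparison like the one described above could be sketched as follows. This is only an illustration, not the script used for the scan: it assumes a layout of one subfolder per package containing `<item>.csv` files, and an index CSV with `Package` and `Item` columns. The function names and both paths are hypothetical.

```python
import csv
from pathlib import Path


def csv_items_on_disk(root):
    """Return {(package, item)} for every <root>/<package>/<item>.csv file."""
    # Path.stem drops only the final suffix, so "head.injury.csv" -> "head.injury".
    return {(p.parent.name, p.stem) for p in Path(root).glob("*/*.csv")}


def items_in_index(index_path):
    """Return {(package, item)} listed in the index (assumed columns: Package, Item)."""
    with open(index_path, newline="", encoding="utf-8") as fh:
        return {(row["Package"], row["Item"]) for row in csv.DictReader(fh)}


def undocumented(root, index_path):
    """CSV files present on disk but absent from the index, sorted."""
    return sorted(csv_items_on_disk(root) - items_in_index(index_path))
```

Keying on the (package, item) pair rather than the item name alone avoids false matches when two packages ship a dataset with the same name.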
Ah, also, there are entries for both hdma and hmda, both from Ecdat and both with seemingly identical descriptions (?) and docs.
Update: DAAG contains both a head.injury.csv and a headInjury.csv, which may be identical? Not sure.
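One quick way to settle whether two such files are identical is a byte-for-byte comparison. A minimal sketch using Python's standard `filecmp` module; the helper name is hypothetical and the paths in the usage comment are assumed, not taken from the repository:

```python
import filecmp


def same_contents(path_a, path_b):
    """True if the two files are byte-for-byte identical."""
    # shallow=False forces a comparison of contents, not just os.stat() metadata.
    return filecmp.cmp(path_a, path_b, shallow=False)


# Example call (paths are assumptions about the local checkout):
# same_contents("csv/DAAG/head.injury.csv", "csv/DAAG/headInjury.csv")
```

A byte comparison is strict: files that hold the same data but differ in line endings, quoting, or column order would be reported as different.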
Thanks for the report. Glad the website is useful!
I looked at a few of these and my best guess is this:
- My script never calls git rm on anything, so datasets stay there forever. This is important in case someone links to the URL in one of their scripts.
- However, the main sheet index is created every time I run the script, and that's based on what is currently available in the packages. I think that also makes sense: If a package maintainer removes a dataset, I may still want to keep permanent links to protect users, but it's probably "polite" to not advertise the dataset anymore.
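The policy in these two points can be illustrated with a small sketch. This is not the actual script, only a toy model of the behavior described: the function name `sync` and the dict-based representation of the archive are hypothetical.

```python
def sync(archive, current):
    """Model one run of an append-only export.

    archive: dict mapping dataset name -> data already published.
    current: dict mapping dataset name -> data available in packages right now.
    Returns (new_archive, index).
    """
    # Add new datasets and refresh existing ones, but never delete anything,
    # so previously published links keep working.
    new_archive = {**archive, **current}
    # The index is rebuilt from scratch and lists only what still exists upstream.
    index = sorted(current)
    return new_archive, index


archive, index = sync(
    {"head.injury": "old data", "headInjury": "v1"},
    {"headInjury": "v2"},
)
# "head.injury" stays in the archive but is no longer listed in the index.
```

Under this model, any name that is in the archive but not in the index is a dataset that was dropped upstream after it had been published, which matches the discrepancies reported in this issue.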
The few datasets I checked didn't seem to be available in their packages anymore. And in the head.injury case, the DAAG changelog says it was a duplicate and was removed:
https://github.com/cran/DAAG/blob/master/NEWS#L27
Again, I didn't check them all, but my provisional conclusion is that things are probably fine as-is. Makes sense?
Sounds good, thanks very much.