Implementation of the SELECT parameter

Question

sngordon opened this issue 7 years ago · comments

I've found that when I request results in an alternative format (RIS, BibTeX), the date returned tends to be the published-online date (perhaps the earlier of published-online and published-print). The authors I'm working with tend to cite the published-print date, so I'm looking for strategies to provide the print date instead.

Examples:
http://api.crossref.org/works/10.1007/s10980-007-9188-1
ris date = published-online = 2007/12/22 (published-print = 2008/2)

http://api.crossref.org/works/10.1139/cjfr-2014-0148
ris date = published-print = 2015/2 (no published-online data)

One option would be to make an additional request to the API specifically for the published-print date using the SELECT parameter, and use it to replace the date in the RIS-formatted record. I don't think Habanero supports the SELECT parameter? I could perhaps modify the counts.py module to pick out the published-print info, but I see it uses a different url requiring a login, so I'm wondering if it might be better to try to implement the SELECT parameter in the cn.py code instead?

Another option would be to modify the crossref API's content negotiation / parsing code, but I don't see this available anywhere (I've only found this general reference: https://citation.crosscite.org/docs.html) (I realize this is more an issue for the crossref API forum).

Scott Chamberlain · Answer 1 · Mon Oct 16 2017 03:20:12 GMT+0800 (China Standard Time)

Thanks for the issue @sngordon

Hadn't seen the select parameter, will add it to habanero, (note to self, see also CrossRef/rest-api-doc#289)

As for your question about print vs. online dates, it doesn't appear that you can select those published date fields with select on that route /works/{DOI}. Not sure what best approach is. Not sure why you'd make a second request using select after the first request, which should have all the fields in it?

AFAIK there's no login required for the content negotiation methods in habanero. The default URL is https://doi.org

Scott Chamberlain · Answer 2 · Mon Oct 16 2017 04:34:42 GMT+0800 (China Standard Time)

@sngordon you can reinstall from github and try again, select parameter is implemented

Scott Chamberlain · Answer 3 · Mon Oct 16 2017 12:33:36 GMT+0800 (China Standard Time)

closing this now as select param implemented

Sean Gordon · Answer 4 · Tue Oct 17 2017 06:23:17 GMT+0800 (China Standard Time)

@sckott thanks so much for the quick implementation of this! I found that I can now combine select with filter to get specific fields from a specific doi:
cr.works(filter = {'doi':"10.1007/s10980-007-9188-1"}, select = "DOI,published-print")
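A minimal sketch of pulling the print date out of that response, assuming `cr` is a `habanero.Crossref()` instance and that the reply has the usual Crossref shape (`message`, then `items`, with dates stored under `date-parts`). The helper names `print_date` and `fetch_print_date` are made up for illustration and are not part of habanero:

```python
def print_date(item):
    """Return the published-print date of one Crossref item as 'YYYY/M[/D]', or None."""
    parts = (item.get("published-print") or {}).get("date-parts") or [[]]
    first = [p for p in parts[0] if p is not None]
    return "/".join(str(p) for p in first) if first else None


def fetch_print_date(doi):
    """Query Crossref for just the DOI and published-print fields (needs network access)."""
    from habanero import Crossref  # imported here so print_date works without habanero

    cr = Crossref()
    res = cr.works(filter={"doi": doi}, select="DOI,published-print")
    items = res["message"]["items"]  # assumed response layout
    return print_date(items[0]) if items else None


# Offline check with a hand-written item mirroring the first example DOI
sample = {"DOI": "10.1007/s10980-007-9188-1",
          "published-print": {"date-parts": [[2008, 2]]}}
print(print_date(sample))
```

For the hand-written sample this prints 2008/2, the print date quoted in the first example; an item without a published-print field gives None, so the caller can keep the RIS date in that case.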

The reason I'm making this additional query for published-print is that our current workflow only requests a RIS-formatted record, which provides just a subset of the full crossref record. Might have to rethink this though. As an ecologist you might be tangentially interested in our bibliographic application: https://nwfp.taccimo.info and https://taccimo.info

The login I was referring to is in the counts.py module, which appears to use a different default url http://www.crossref.org/openurl/

Scott Chamberlain · Answer 5 · Tue Oct 17 2017 12:39:01 GMT+0800 (China Standard Time)

Great, glad that worked for you with select and filter

I see. Yes, counts has an email address for my collaborator. The idea is that module will no longer be needed some day as data are supposed to make it into the main crossref API, but who knows when that will be

Very cool about the tool you make. Is that using Crossref API in the backend then?

Sean Gordon · Answer 6 · Thu Oct 19 2017 08:47:42 GMT+0800 (China Standard Time)

The taccimo site is a MySQL database with a php front end. The references are actually hand selected or read in from key documents in an unstructured format. Then we use Crossref to attempt to retrieve a doi and a structured record (using RIS format now but may switch to citeproc-json so I can make sure to get the published-print date). Next I'm working on using the structured record to return a formatted citation in any style from the CSL. Unfortunately the style I created for this current project doesn't seem to be working, but I doubt this has anything to do with habanero:

from habanero import cn
cn.content_negotiation(ids = "10.1007/s10980-007-9188-1", format = "text", style = "usda-forest-service-pacific-northwest-research-station")
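For the possible switch to citeproc-json mentioned above, here is a rough sketch of choosing a date once the record is parsed. It assumes `cn.content_negotiation` returns the JSON as a string for a single DOI and that the record carries `published-print`, `published-online`, and `issued` fields with `date-parts`; the names `DATE_FIELDS`, `preferred_date`, and `fetch_record` are hypothetical:

```python
import json

# Order of preference is an assumption based on this thread: print date first
DATE_FIELDS = ("published-print", "published-online", "issued")


def preferred_date(record, fields=DATE_FIELDS):
    """Return (field_name, date_parts) for the first date field present, else None."""
    for name in fields:
        parts = (record.get(name) or {}).get("date-parts") or []
        if parts and parts[0] and parts[0][0] is not None:
            return name, list(parts[0])
    return None


def fetch_record(doi):
    """Fetch a citeproc-json record via content negotiation (needs network access)."""
    from habanero import cn  # imported lazily so preferred_date runs without habanero

    return json.loads(cn.content_negotiation(ids=doi, format="citeproc-json"))


# Offline check with hand-written records shaped like the two example DOIs
online_and_print = {"published-online": {"date-parts": [[2007, 12, 22]]},
                    "published-print": {"date-parts": [[2008, 2]]}}
print_only = {"published-print": {"date-parts": [[2015, 2]]}}
print(preferred_date(online_and_print))
print(preferred_date(print_only))
```

Falling back through the list means a record with no print date still yields some date, and the returned field name shows which one was used.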

Scott Chamberlain · Answer 7 · Fri Oct 20 2017 00:59:58 GMT+0800 (China Standard Time)

here's the docs page for content negotiation in case you hadn't seen it https://citation.crosscite.org/docs.html

yeah, i don't know why that's not working. will see if i can find out

Scott Chamberlain · Answer 8 · Fri Oct 20 2017 08:36:39 GMT+0800 (China Standard Time)

so apparently the CSL style just hasn't been updated in the place where styles are pulled from during content negotiation, see the error in

curl -LH "Accept: text/x-bibliography; style=usda-forest-service-pacific-northwest-research-station" https://doi.org/10.1007/s10980-007-9188-1 | jq .

but folks at https://citation.crosscite.org/ did just update CSL styles so you can get a format there, or by curl like

curl -v 'https://citation.crosscite.org/format?doi=10.1007%2Fs10980-007-9188-1&style=usda-forest-service-pacific-northwest-research-station&lang=en-US'

but that of course is not in habanero
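Until then, the same request can be made from Python with the standard library alone. This is only a sketch: the endpoint and the `doi`, `style`, and `lang` query parameters are copied from the curl call above, while the function names `format_url` and `formatted_citation` are invented here and the response is assumed to be UTF-8 text:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

FORMAT_ENDPOINT = "https://citation.crosscite.org/format"  # URL taken from the curl call above


def format_url(doi, style, lang="en-US"):
    """Build the formatter URL; urlencode percent-encodes the '/' in the DOI."""
    return FORMAT_ENDPOINT + "?" + urlencode({"doi": doi, "style": style, "lang": lang})


def formatted_citation(doi, style, lang="en-US"):
    """Fetch the formatted citation text (needs network access)."""
    with urlopen(format_url(doi, style, lang)) as resp:
        return resp.read().decode("utf-8")


# Building the URL needs no network access and reproduces the curl URL above
print(format_url("10.1007/s10980-007-9188-1",
                 "usda-forest-service-pacific-northwest-research-station"))
```

Calling `formatted_citation` with the same two arguments should then return the citation text, provided the service still accepts that style name.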

Sean Gordon · Answer 9 · Sun Oct 22 2017 09:04:58 GMT+0800 (China Standard Time)

Thanks again! I wasn't sure of where in the chain the problem was located. I didn't know about https://citation.crosscite.org/ either, but I'm sure I can rig some code to use that.