adjust Node Norm urls and options?
colleenXu opened this issue · comments
[UPDATED to add info from Chris Bizon (Translator Slack link)]
In the previous update #731, the urls for each NodeNorm instance were hard-coded, and I think CI / Test / Prod use "1.3" version urls.
We should make some changes:
- use urls without the version, like https://nodenorm.transltr.io/get_normalized_nodes. This should always access the latest version of NodeNorm, so we don't have to update the urls whenever they change versions
- the current, latest NodeNorm is version "1.4"
- change our requests to set the `conflate` (gene/protein conflation) and `drug_chemical_conflate` options to true (see docs). (here are my previous notes: #731 (comment))
  - It's certain that ARAs should do gene/protein conflation
  - It's not clear if ARAs should do drug/chem conflation, but Chris Bizon agrees that it makes sense to do
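A rough sketch of what the adjusted request could look like, assuming NodeNorm accepts a POST with a JSON body on the unversioned url. The `conflate` and `drug_chemical_conflate` option names are from the list above; the `curies` body key and the helper names `build_payload` / `normalize` are assumptions for illustration, not BTE's actual code.

```python
import json
import urllib.request

# Unversioned url quoted above; it is expected to track the latest NodeNorm release.
NODENORM_URL = "https://nodenorm.transltr.io/get_normalized_nodes"


def build_payload(curies, conflate=True, drug_chemical_conflate=True):
    """Build the JSON body for a POST to get_normalized_nodes.

    The "curies" key is an assumption about the request schema; the two
    conflation option names come from the issue text.
    """
    return {
        "curies": list(curies),
        "conflate": conflate,
        "drug_chemical_conflate": drug_chemical_conflate,
    }


def normalize(curies, url=NODENORM_URL, timeout=30):
    """POST the payload and return the parsed JSON response (needs network access)."""
    body = json.dumps(build_payload(curies)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping the url in one constant and both conflation flags as defaults means a later NodeNorm change would only touch one place. Calling `normalize([...])` with a list of CURIEs would send both options as true unless overridden.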
Note on NodeNorm SmartAPI registrations:
- It's kinda tricky to use the SmartAPI registrations rather than hard-coded request urls.
- NodeNorm doesn't seem to have a stable registration for its service. Its team seems to create new registrations for each new version/release of NodeNorm (rather than updating the existing registration)
- A new NodeNorm release/registration also may not have all the server maturities (since it'll roll out over time).
- Right now, there are "1.3" and "1.4" registrations for NodeNorm in the SmartAPI registry, but they said they'll remove the "1.3" registration
And a note for now, which may become its own to-do / issue...
I noticed that even when I set conflate=true, ENSP IDs like this seem to normalize as Protein and not Gene.
If this is intended behavior, I may want to compare the Gene/Protein namespaces in the x-bte annotations, the NodeNorm behavior, and the operation predicates to see if they all line up. If they don't, I may need to set the operation to Protein and adjust the predicate to a non-Gene-specific one...
I'm thinking specifically of the BioThings DISEASES operations, which is where I discovered this situation: Disease condition-associated-with-gene Gene, but the Gene side uses Ensembl ENSP IDs, which NodeNorm says are Proteins. So I don't know if the predicate still makes sense.
It's not clear if this needs to be a hot-fix all the way to Prod, or if it can go along with Translator's Lobster release schedule. Either way, it is a high-ish priority to fix this.
It depends on whether the hard-coded 1.3 urls we are using for CI / Test / Prod are going to be deprecated / unavailable soon or not. I've asked in Translator Slack here
Update: this doesn't need to be a hot-fix, it can go along with Lobster's release schedule.
The other team can keep the 1.3 urls for the Lobster release (Translator Slack link). These re-route to 1.4 under the hood right now, which is fine for us.
Deployed to Prod.