biothings / biothings_explorer

TRAPI service for BioThings Explorer

Home Page: https://api.bte.ncats.io


adjust Node Norm urls and options?

colleenXu opened this issue

[UPDATED to add info from Chris Bizon (Translator Slack link)]

In the previous update #731, the urls for each NodeNorm instance were hard-coded, and I think CI / Test / Prod use "1.3" version urls.

We should make some changes:

  1. use urls without the version, like https://nodenorm.transltr.io/get_normalized_nodes. This should always access the latest version of NodeNorm.
    • then we don't have to update the urls whenever they change versions
    • the current, latest NodeNorm is version "1.4"
  2. change our requests to set the conflate (gene/protein conflation) and drug_chemical_conflate options to true (see docs). (here are my previous notes: #731 (comment))
    • It's certain that ARAs should do gene/protein conflation
    • It's not clear if ARAs should do drug/chem conflation, but Chris Bizon agrees that it makes sense to do
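To make the two proposed changes concrete, here is a minimal sketch of what the request could look like. It assumes NodeNorm's POST endpoint accepts a JSON body with `curies`, `conflate`, and `drug_chemical_conflate` fields (worth double-checking against the NodeNorm docs). `build_nodenorm_request` is a hypothetical helper name, not something in the BTE codebase, and the example CURIE is only a placeholder input.

```python
import json

# Unversioned endpoint quoted in this issue; it should track the latest NodeNorm.
NODENORM_URL = "https://nodenorm.transltr.io/get_normalized_nodes"


def build_nodenorm_request(curies, conflate=True, drug_chemical_conflate=True):
    """Return (url, json_body) for a NodeNorm POST request.

    Hypothetical helper for illustration: both conflation options default
    to True, as proposed in this issue.
    """
    payload = {
        "curies": list(curies),
        "conflate": conflate,  # gene/protein conflation
        "drug_chemical_conflate": drug_chemical_conflate,  # drug/chemical conflation
    }
    return NODENORM_URL, json.dumps(payload)


# Placeholder CURIE, used only to show the shape of the request.
url, body = build_nodenorm_request(["NCBIGene:1017"])
print(url)
print(body)
```

Keeping the url in one constant means a future NodeNorm version bump shouldn't require touching the request code at all.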

Note on NodeNorm SmartAPI registrations:

  • It's kinda tricky to use the SmartAPI registrations rather than hard-coded request urls.
    • NodeNorm doesn't seem to have a stable registration for its service. Its team seems to create new registrations for each new version/release of NodeNorm (rather than updating the existing registration)
    • A new NodeNorm release/registration also may not have all the server maturities (since it'll roll out over time).
  • Right now, there's a "1.3" and a "1.4" registration for NodeNorm in the SmartAPI registry, but they said they'll remove the "1.3" registration

And a note for now, which may become its own to-do / issue...

I noticed that even when I set conflate=true, ENSP IDs like this seem to normalize as Protein and not Gene.

If this is intended behavior, I may want to review whether the Gene/Protein namespaces in the x-bte annotations, the NodeNorm behavior, and the operation predicates all line up with each other. If they don't, I may need to set the operation to Protein and adjust the predicate to a non-Gene-specific one...

I'm thinking specifically of the BioThings DISEASES operations, which is where I discovered this situation: Disease condition-associated-with-gene Gene, but the Gene side uses Ensembl ENSP IDs, which NodeNorm says are Proteins. I then don't know if the predicate makes sense.
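A rough sketch of how that review could be automated: given a parsed NodeNorm response, flag the IDs whose reported type disagrees with what the x-bte annotation declares. This assumes each response entry carries a `type` list with the most specific Biolink category first, and that unrecognized CURIEs come back as null. The sample response below is made up for illustration (including the placeholder ID), and both helper names are hypothetical.

```python
def primary_types(nodenorm_response):
    """Map each CURIE to the first entry of its "type" list (or None)."""
    result = {}
    for curie, entry in nodenorm_response.items():
        # Assumption: unrecognized CURIEs are returned as null/None.
        types = (entry or {}).get("type") or []
        result[curie] = types[0] if types else None
    return result


def find_mismatches(nodenorm_response, expected_type):
    """List CURIEs whose primary type differs from the annotated type."""
    return [
        curie
        for curie, actual in primary_types(nodenorm_response).items()
        if actual != expected_type
    ]


# Made-up response for illustration only; not real NodeNorm output.
sample_response = {
    "ENSEMBL:ENSP_PLACEHOLDER": {"type": ["biolink:Protein", "biolink:NamedThing"]},
    "NCBIGene:PLACEHOLDER": {"type": ["biolink:Gene", "biolink:NamedThing"]},
    "UNKNOWN:PLACEHOLDER": None,
}

# The operation is annotated as Gene, so anything else gets flagged.
print(find_mismatches(sample_response, "biolink:Gene"))
```

Unrecognized IDs are flagged too, which seems useful here since they would also fail to line up with the annotation.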

It's not clear whether this needs to be a hot-fix all the way to Prod or whether it can go along with Translator's Lobster release schedule. Either way, it is a high-ish priority to fix this.

It depends on whether the hard-coded 1.3 urls we are using for CI / Test / Prod are going to be deprecated / unavailable soon or not. I've asked in Translator Slack here

Update: this doesn't need to be a hot-fix, it can go along with Lobster's release schedule.

The other team can keep the 1.3 urls for the Lobster release (Translator Slack link). These reroute to 1.4 under the hood right now, which is fine for us.

Deployed to Prod.