All KG Edge source data is being mutated/removed from BTE's responses
colleenXu opened this issue
EDITS:
- We've gotten confirmation that this is happening due to ARS-CI (Translator Slack link)
- All tools are likely affected: in my example here, Aragorn and ARAX also seem to have this problem (based on their sources info and their responses' logs; my screenshot below shows that their edges also have only 1 source)
I queried ARS CI with a "creative-mode treats" query (see TRAPI query below), and something removed all of the KG edge source data from BTE's response and replaced it with BTE as the "primary knowledge source". I think this happens after BTE returns its response to the ARS.
- PK for ARS: 3c4ee104-d8a6-46f3-95d1-85153ddf572b. In ARAX UI, ARS CI
- PK for BTE's response specifically: eb5c4852-b801-40fe-bbcd-70598afc3146. In ARAX UI, ARS
- The async-job link https://bte.ci.transltr.io/v1/asyncquery_response/ruPFWCI12w didn't work, so I can't verify what BTE's original response looked like (before it was sent to the ARS). The link brings me to {"error":"Response expired. Responses are kept 30 days."}. Is this related to #763?
TRAPI query to ARS CI
{
  "message": {
    "query_graph": {
      "nodes": {
        "n0": {
          "categories": ["biolink:ChemicalEntity"]
        },
        "n1": {
          "ids": ["MONDO:0002251"],
          "categories": ["biolink:DiseaseOrPhenotypicFeature"],
          "name": "hepatitis"
        }
      },
      "edges": {
        "e0": {
          "subject": "n0",
          "object": "n1",
          "predicates": ["biolink:treats"],
          "knowledge_type": "inferred"
        }
      }
    }
  }
}
Screenshot showing that all the KG edges have BTE as the primary knowledge source
The last log entry looks odd: it has a very different timestamp from the others and seems to note that the KG edge source data is being removed from all edges.
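The symptom described above (every KG edge listing only BTE as its primary knowledge source) can also be checked without a screenshot. The following is a minimal Python sketch, not part of BTE or the ARS. It assumes a TRAPI 1.4-style response in which each KG edge has a `sources` list of objects with `resource_id` and `resource_role`, and it assumes BTE's identifier is `infores:biothings-explorer`. The function names and the sample edges are hypothetical.

```python
from collections import Counter

# Assumption: BTE's infores identifier as it appears in edge sources.
BTE_INFORES = "infores:biothings-explorer"


def primary_sources(edge):
    """Return the resource_ids marked primary_knowledge_source on one KG edge."""
    return [
        s.get("resource_id")
        for s in edge.get("sources") or []
        if s.get("resource_role") == "primary_knowledge_source"
    ]


def summarize_edge_sources(response):
    """Count KG edges whose only source is BTE, marked as the primary source."""
    kg = (response.get("message") or {}).get("knowledge_graph") or {}
    counts = Counter()
    for edge in (kg.get("edges") or {}).values():
        sources = edge.get("sources") or []
        collapsed = len(sources) == 1 and primary_sources(edge) == [BTE_INFORES]
        counts["collapsed" if collapsed else "intact"] += 1
    return counts


# Illustrative data only: one edge showing the symptom, one with a source chain.
sample = {
    "message": {
        "knowledge_graph": {
            "edges": {
                "edge-a": {
                    "sources": [
                        {"resource_id": BTE_INFORES,
                         "resource_role": "primary_knowledge_source"}
                    ]
                },
                "edge-b": {
                    "sources": [
                        {"resource_id": "infores:example-kp",
                         "resource_role": "primary_knowledge_source"},
                        {"resource_id": BTE_INFORES,
                         "resource_role": "aggregator_knowledge_source",
                         "upstream_resource_ids": ["infores:example-kp"]},
                    ]
                },
            }
        }
    }
}

counts = summarize_edge_sources(sample)
```

If every edge in a downloaded response lands in the `collapsed` bucket, that matches what the screenshot shows.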
I wonder if this is related to NCATSTranslator/Feedback#628
If I run the same query in my local BTE instance, the KG edge sources info isn't missing. (I can get the full async response without any issues too!)
Responses
First: response from http://localhost:3000/v1/asyncquery
{
  "status": "Accepted",
  "description": "Async query queued",
  "job_id": "uwpLGRVL2X",
  "job_url": "http://localhost:3000/v1/asyncquery_status/uwpLGRVL2X"
}
Second: response from the asyncquery_status endpoint (attached: asyncquery_status.txt)
Third: response from the asyncquery_response endpoint, which shows all the KG Edge source info intact (attached: asyncquery_response.txt)
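To compare a local BTE response against the copy relayed by the ARS, one option is to diff the source `resource_id`s per edge. This is a rough Python sketch under stated assumptions: both inputs are parsed TRAPI responses, and edges are matched by their (subject, predicate, object) triple, because KG edge IDs may not survive relaying. The function names, node IDs, and source IDs in the sample are hypothetical.

```python
def source_ids_by_triple(response):
    """Map each (subject, predicate, object) triple to its set of source resource_ids."""
    kg = (response.get("message") or {}).get("knowledge_graph") or {}
    out = {}
    for edge in (kg.get("edges") or {}).values():
        triple = (edge.get("subject"), edge.get("predicate"), edge.get("object"))
        ids = {s.get("resource_id") for s in edge.get("sources") or []}
        out.setdefault(triple, set()).update(ids)
    return out


def lost_sources(original, relayed):
    """Return, per triple, the source IDs present in `original` but absent in `relayed`.

    A triple missing entirely from `relayed` is reported with all of its sources.
    """
    before = source_ids_by_triple(original)
    after = source_ids_by_triple(relayed)
    diff = {}
    for triple, ids in before.items():
        missing = ids - after.get(triple, set())
        if missing:
            diff[triple] = missing
    return diff


def _response(sources):
    """Build a one-edge response for illustration (hypothetical IDs)."""
    return {"message": {"knowledge_graph": {"edges": {"e": {
        "subject": "CHEBI:1", "predicate": "biolink:treats",
        "object": "MONDO:0002251", "sources": sources}}}}}


local = _response([
    {"resource_id": "infores:example-kp", "resource_role": "primary_knowledge_source"},
    {"resource_id": "infores:biothings-explorer", "resource_role": "aggregator_knowledge_source"},
])
relayed = _response([
    {"resource_id": "infores:biothings-explorer", "resource_role": "primary_knowledge_source"},
])

diff = lost_sources(local, relayed)
```

An empty result would mean no source IDs were dropped for the triples present in the local response. The sketch ignores changes in `resource_role`, so a role swap on the same ID is not reported.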
Setup:
- all main branches (should match BTE CI)
- run CI-specific smartapi sync with overrides
API_OVERRIDE=true INSTANCE_ENV=ci pnpm run smartapi_sync
- run local BTE in CI-mode, with redis for asyncquery
INSTANCE_ENV=ci pnpm start redis
- POST the TRAPI query from the opening post to
http://localhost:3000/v1/asyncquery
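The final POST step can be scripted. Below is a small Python sketch using only the standard library. It assumes the local instance listens on http://localhost:3000 and that the local response endpoint follows the same `/v1/asyncquery_response/<job_id>` pattern as the BTE CI link in this issue. `submit_async` and `response_url` are hypothetical helper names, and `submit_async` is defined but not called here because it needs a running instance.

```python
import json
import urllib.request


def submit_async(base_url, query):
    """POST a TRAPI query to <base_url>/v1/asyncquery and return the parsed JSON reply."""
    req = urllib.request.Request(
        f"{base_url}/v1/asyncquery",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def response_url(base_url, accepted):
    """Build the asyncquery_response URL from the 'Accepted' reply's job_id.

    Assumption: the path mirrors the CI link format quoted in this issue.
    """
    return f"{base_url}/v1/asyncquery_response/{accepted['job_id']}"


# The 'Accepted' reply shown earlier in this issue.
accepted = {
    "status": "Accepted",
    "description": "Async query queued",
    "job_id": "uwpLGRVL2X",
    "job_url": "http://localhost:3000/v1/asyncquery_status/uwpLGRVL2X",
}

url = response_url("http://localhost:3000", accepted)
```

With a local instance running, `submit_async("http://localhost:3000", query)` would return a reply like `accepted`, and the `job_url` field in it points at the status endpoint to check before fetching the full response.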
Investigation Update
EDIT: Only ARS-CI seems to be affected. All other instances seem to keep KG Edge source info intact:
- ARS-prod
- ARS-test
- ARS-dev (which seems to be using BTE CI as well):
  - PK for ARS: 49bd7a74-621a-4d62-947e-22519b5f0a74. In ARAX-CI UI, ARS dev
  - PK for BTE's response specifically: 2f497608-57c3-45f9-b054-499243ccaca6. In ARAX-CI UI, ARS-dev
  - The async-job link https://bte.ci.transltr.io/v1/asyncquery_response/n3qiSfynpS works right now! This is the downloaded response from this link: n3qiSfynpS.txt
Update
I'm not seeing the "mutated/removed KG Edge source info" in ARS-CI responses now:
The other ARS instances also don't show this behavior (they didn't when I investigated last month either).
(I'm using ARAX-CI's UI, but I did submit all requests to the proper ARS instances)
The issue seemed to resolve before the ARS team did any investigation or code adjustment, which is unexpected (Translator Slack link). But because things seem resolved, I'm closing this issue.