CCF validation invalids picked up
shawntanzk opened this issue · comments
@dosumis I'm using this ticket to collect examples picked up by the validation that I think are good ones. (Still working on it: I've just gotten a version with labels for the ABA stuff, so I might do a better job here.)
- Uncurated entities get mapped to regional part of brain
Currently most of the non_valids come from terms that are mapped to regional part of brain.
Regional part of brain cannot be part of a smaller unit. Example:
o | s | olabel | slabel | user_olabel | user_slabel |
---|---|---|---|---|---|
UBERON:0007225 | UBERON:0002616 | lateral entorhinal cortex | regional part of brain | http://purl.obolibrary.org/obo/MBA_918 | http://purl.obolibrary.org/obo/MBA_312 |
- Developmental structure as part_of any part of the brain
Example: mesomere 1 part_of midbrain (also part_of regional part of brain)
In Uberon developmental parts are subclassof future brain rather than brain
- Looks like a good one but I need to do more research (leave this with me, I'll figure this out):
o | s | olabel | slabel | user_olabel | user_slabel |
---|---|---|---|---|---|
UBERON:0002714 | UBERON:0002615 | rubrospinal tract | ventral tegmental decussation | http://purl.obolibrary.org/obo/MBA_863 | http://purl.obolibrary.org/obo/MBA_397 |
- Should be overlap instead of part_of - esp prominent in spinal cord stuff
Examples:
o | s | olabel | slabel | user_olabel | user_slabel |
---|---|---|---|---|---|
UBERON:0002616 | UBERON:0002707 | regional part of brain | corticospinal tract | http://purl.obolibrary.org/obo/MBA_784 | http://purl.obolibrary.org/obo/MBA_190 |
UBERON:0002272 | UBERON:0001935 | medial zone of hypothalamus | ventromedial nucleus of hypothalamus | http://purl.obolibrary.org/obo/MBA_467 | http://purl.obolibrary.org/obo/MBA_693 |
UBERON:0002616 | UBERON:0002291 | regional part of brain | central canal of spinal cord | http://purl.obolibrary.org/obo/MBA_73 | http://purl.obolibrary.org/obo/MBA_164 |
6a. Should be (I THINK) connected_to instead of part_of (will make sure this is not overlaps too, but I'm guessing lateral and posterior would be separate)
o | s | olabel | slabel | user_olabel | user_slabel |
---|---|---|---|---|---|
UBERON:0002736 | UBERON:0002709 | lateral nuclear group of thalamus | posterior nuclear complex of thalamus | http://purl.obolibrary.org/obo/MBA_138 | http://purl.obolibrary.org/obo/MBA_1020 |
6b. Things that are not in the brain (I'm guessing it's an issue of connected_to also)
o | s | olabel | slabel | user_olabel | user_slabel |
---|---|---|---|---|---|
UBERON:0002616 | UBERON:0001646 | regional part of brain | abducens nerve | http://purl.obolibrary.org/obo/MBA_967 | http://purl.obolibrary.org/obo/MBA_710 |
UBERON:0002616 | UBERON:0001647 | regional part of brain | facial nerve | http://purl.obolibrary.org/obo/MBA_967 | http://purl.obolibrary.org/obo/MBA_798 |
- CCF validation showing invalid due to Uberon not having relation:
MAYBE
-- | -- | -- | -- | -- | --
lateral zone of hypothalamus | zona incerta | http://purl.obolibrary.org/obo/MBA_290 | Hypothalamic lateral zone | http://purl.obolibrary.org/obo/MBA_797 | Zona incerta
Is there a quick summary of how to read these?
For the first one it seems there is an implicit rdfs:subClassOf between s and o? And 'user' is a user's mapping?
> Regional part of brain cannot be part of a smaller unit
Which OWL axiom is this?
Is the root of this issue:
id: UBERON:0007225
name: lateral entorhinal cortex
...
is_a: UBERON:0000064 ! organ part
Which is a little odd: some regions are "organ part", some are "regional part of brain". I think we could do a search and replace on these just for structural consistency.
> Is there a quick summary of how to read these?
o is object and s is subject - so it looks for s part_of o (or subclass of, overlaps, connected to, develops from)
user_olabel/user_slabel is the MBA term mapped to the Uberon term.
In general I wouldn't bother looking too much into this for now. We have a new list of expert-curated bridge mappings that I'm using to update the bridge file. After that we plan to re-run the CCF validation, so hopefully all of these will be gone :)
I'm also thinking of running boomer on Uberon + ABAs which may overlap with this QC check
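To make the reading convention above concrete, here is a minimal Python sketch that parses one pipe-delimited report row and spells out the relationship that was looked for. The function names `parse_row` and `describe_row` and the wording of the message are assumptions for illustration; they are not part of the validation tooling.

```python
# Column order as it appears in the report tables in this ticket.
COLUMNS = ["o", "s", "olabel", "slabel", "user_olabel", "user_slabel"]

# Relations the check looks for between subject and object (as listed above).
RELATIONS = ["part_of", "subClassOf", "overlaps", "connected_to", "develops_from"]


def parse_row(line):
    """Split one pipe-delimited report row into a dict keyed by column name."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    return dict(zip(COLUMNS, cells))


def describe_row(row):
    """Describe the check that failed: subject (s) related to object (o)."""
    return (
        f"{row['slabel']} ({row['s']}) has no "
        f"{' / '.join(RELATIONS)} relation to "
        f"{row['olabel']} ({row['o']})"
    )


row = parse_row(
    "UBERON:0007225 | UBERON:0002616 | lateral entorhinal cortex | "
    "regional part of brain | http://purl.obolibrary.org/obo/MBA_918 | "
    "http://purl.obolibrary.org/obo/MBA_312 |"
)
print(describe_row(row))
```

Read this way, the first example row says that "regional part of brain" (the subject) could not be related to "lateral entorhinal cortex" (the object).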
> Is there a quick summary of how to read these?
The headers were designed for reporting on ASCT+B tables: user_slabel and user_olabel are the names given to the subject and object of the relationship in the tables. In our repurposing, these correspond to the child and parent term in the structureGraph. I think these would be easier to read if user_olabel and user_slabel had the form 'label ; id'.
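As a rough sketch of the suggested 'label ; id' form, something like the following could work. The helper names are hypothetical, and rendering the id as a CURIE (MBA:290) rather than the full IRI is an assumption; the label and IRI used in the example are taken from the MAYBE row earlier in this ticket.

```python
def iri_to_curie(iri):
    """Turn an OBO PURL such as .../obo/MBA_290 into the CURIE MBA:290."""
    local = iri.rstrip("/").rsplit("/", 1)[-1]      # e.g. "MBA_290"
    prefix, _, local_id = local.partition("_")      # ("MBA", "_", "290")
    return f"{prefix}:{local_id}"


def label_with_id(label, iri):
    """Format a user column value as 'label ; id'."""
    return f"{label} ; {iri_to_curie(iri)}"


print(label_with_id("Hypothalamic lateral zone",
                    "http://purl.obolibrary.org/obo/MBA_290"))
# prints: Hypothalamic lateral zone ; MBA:290
```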
Note - "invalid" is an overstatement. All the reports test is whether child --> parent in the structureGraph corresponds to a relationship (asserted or inferred) between the Uberon mapping of child and the Uberon mapping of parent, where that relationship might be subClassOf or an existential restriction on part_of, overlaps or connected_to. Maybe "non-validated" would be a more accurate term to use.
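The test described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the asserted and inferred Uberon relationships are assumed to be available as a precomputed set of (subject, relation, object) triples, the function name `classify_edge` is hypothetical, and the identifiers in the example are toy placeholders rather than real terms.

```python
# Relations accepted as supporting a structureGraph edge (from the text above).
RELATIONS = ("subClassOf", "part_of", "overlaps", "connected_to")


def classify_edge(child, parent, mapping, uberon_relations):
    """Return the relation supporting child --> parent, or None if non-validated.

    mapping: structureGraph id -> Uberon id (the bridge).
    uberon_relations: set of (subject, relation, object) triples, assumed to
    already include inferred relationships.
    """
    s = mapping.get(child)
    o = mapping.get(parent)
    if s is None or o is None:
        return None  # unmapped terms cannot be checked
    for rel in RELATIONS:
        if (s, rel, o) in uberon_relations:
            return rel
    return None


# Toy placeholder data, not real MBA or Uberon content.
mapping = {"MBA:A": "UBERON:X", "MBA:B": "UBERON:Y"}
uberon_relations = {("UBERON:X", "part_of", "UBERON:Y")}

print(classify_edge("MBA:A", "MBA:B", mapping, uberon_relations))  # part_of
print(classify_edge("MBA:B", "MBA:A", mapping, uberon_relations))  # None
```

A `None` result here means only that no supporting relationship was found, which is why "non-validated" fits better than "invalid".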
Working through some examples (just documenting - no need to reply)
Looks like this might be a genuine difference in definition and composition of zones between ABA and Uberon
o | s | olabel | slabel | user_olabel | user_slabel |
---|---|---|---|---|---|
UBERON:0002272 | UBERON:0001935 | medial zone of hypothalamus | ventromedial nucleus of hypothalamus | http://purl.obolibrary.org/obo/MBA_467 | http://purl.obolibrary.org/obo/MBA_693 |
This one could also be a genuine difference in how nuclei are grouped, this time by partitioning of the thalamus (or it could be that Uberon just didn't partition this nucleus).
o | s | olabel | slabel | user_olabel | user_slabel |
---|---|---|---|---|---|
UBERON:0002736 | UBERON:0002709 | lateral nuclear group of thalamus | posterior nuclear complex of thalamus | http://purl.obolibrary.org/obo/MBA_138 | http://purl.obolibrary.org/obo/MBA_1020 |
This one might be overlap:
o | s | olabel | slabel | user_olabel | user_slabel |
---|---|---|---|---|---|
UBERON:0002714 | UBERON:0002615 | rubrospinal tract | ventral tegmental decussation | http://purl.obolibrary.org/obo/MBA_863 | http://purl.obolibrary.org/obo/MBA_397 |