Tian312 / MD-Attention

MD-informed Self-Attention: a neuro-symbolic model for machine reading comprehension of clinical research literature.

MD-informed Self-Attention :octocat:

We introduce Medical Evidence Dependency (MD)-informed Self-Attention, a neuro-symbolic model for understanding free-text medical evidence in the literature. We hypothesize that this method combines the best of both worlds: the high capacity of neural networks and the rigor, semantic clarity, and reusability of symbolic logic.

Repository

MDAtt.py: generates the Medical Evidence Dependency-informed attention head.
MED_modeling.py: modified from bert/modeling.py (attention_layer, transformer and bert classes).
run_MDAttBert.py: runs BERT with the Medical Evidence Dependency-informed attention head.

Model description

Model: We develop a symbolic compositional representation called Medical Evidence Dependency (MD) to represent the basic medical evidence entities and relations, following the PICO (Population, Intervention, Comparison, Outcome) framework widely adopted among clinicians for searching evidence. We use the Transformer as the backbone and train one head of the Multi-Head Self-Attention to attend to MD and to pass linguistic and domain knowledge on to later layers (MD-informed). We integrated MD-informed Attention into BioBERT and evaluated it on two public machine reading comprehension (MRC) benchmarks for medical evidence from the literature: Evidence Inference 2.0 and PubMedQA.
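To make the idea of steering one attention head toward MD more concrete, here is a minimal sketch in plain Python. It is not taken from MDAtt.py or MED_modeling.py: the function names (`attention_weights`, `md_alignment_loss`) are hypothetical, and the cross-entropy alignment loss is an assumed training signal, since the objective actually used is not described in this README.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries: Matrix, keys: Matrix) -> Matrix:
    """Scaled dot-product attention weights for a single head."""
    scale = math.sqrt(len(queries[0]))
    return [
        softmax([sum(q * k for q, k in zip(q_vec, k_vec)) / scale for k_vec in keys])
        for q_vec in queries
    ]


def md_alignment_loss(attn: Matrix, md: List[List[int]]) -> float:
    """ASSUMED objective: mean cross-entropy between each attention row and
    the row-normalised MD matrix. Rows without any MD link are skipped."""
    total, rows = 0.0, 0
    for attn_row, md_row in zip(attn, md):
        links = sum(md_row)
        if links == 0:
            continue
        total -= sum((m / links) * math.log(a + 1e-12) for a, m in zip(attn_row, md_row))
        rows += 1
    return total / rows if rows else 0.0


# Toy example: two tokens that are linked to each other by an MD relation.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
md = [[0, 1], [1, 0]]

attn = attention_weights(queries, keys)
loss = md_alignment_loss(attn, md)
```

In a setup like this, the loss would shrink as the chosen head places more attention mass on MD-linked token pairs, while the remaining heads stay unconstrained.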

Figure: Medical Evidence Dependency (MD) and Proposition

Figure: Medical Evidence Dependency (MD) Matrix

Figure: Medical Evidence Dependency (MD)-informed Self-Attention
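As a rough illustration of what an MD matrix could look like, the sketch below builds a token-by-token 0/1 matrix from annotated relations. The span format, the symmetric fill, and the helper name `build_md_matrix` are all assumptions made for this example; the matrix layout actually produced by MDAtt.py may differ.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token span [start, end), an assumed format


def build_md_matrix(num_tokens: int, relations: List[Tuple[Span, Span]]) -> List[List[int]]:
    """Return a num_tokens x num_tokens 0/1 matrix in which entry (i, j) is 1
    when tokens i and j belong to two entities linked by an MD relation."""
    matrix = [[0] * num_tokens for _ in range(num_tokens)]
    for (a_start, a_end), (b_start, b_end) in relations:
        for i in range(a_start, a_end):
            for j in range(b_start, b_end):
                matrix[i][j] = 1
                matrix[j][i] = 1  # assumption: dependencies are treated as symmetric
    return matrix


# Toy sentence with hypothetical annotations: intervention -> outcome and
# comparator -> outcome.
tokens = ["aspirin", "reduced", "mortality", "versus", "placebo"]
relations = [((0, 1), (2, 3)), ((4, 5), (2, 3))]
md = build_md_matrix(len(tokens), relations)
```

Here "aspirin" and "placebo" are each linked to "mortality" but not to each other, so a head guided by this matrix would be encouraged to connect each arm of the comparison with the outcome even when they are far apart in the text.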

Results: Integrating the MD-informed Attention head improves BioBERT substantially on both benchmarks, by as much as +30% in F1 score, and achieves new state-of-the-art performance on Evidence Inference 2.0. Visualizing the weights learned by the MD-informed Attention head shows that the model can capture clinically meaningful relations separated by long passages of text.
