danlou / MedLinker-Social

UMLS Medical Entity Linking. Adaptation of MedLinker (ECIR 2020) for the Social Domain.

MedLinker-Social

Adaptation of MedLinker (ECIR 2020) to the Social Domain.

Project Homepage: http://danlou.github.io/medlinker-social/

Adaptation Report: http://danlou.github.io/files/papers/medlinker_social_report.pdf

Installation

Prepare Environment

This project was developed on Python 3.6.5 from the Anaconda distribution (v4.6.14). As such, the pip requirements assume you already have the packages that ship with Anaconda (numpy, etc.). After cloning the repository, we recommend creating and activating a new environment to avoid conflicts with existing installations on your system:

$ git clone https://github.com/danlou/MedLinker-Social.git
$ cd MedLinker-Social
$ conda create -n MedLinker-Social python=3.6.5
$ conda activate MedLinker-Social
# $ conda deactivate  # to exit environment when done with project

Additional Packages

To install additional packages used by this project (YAKE and SimString) run:

$ pip install -r requirements.txt
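
To confirm the installation worked, a quick import check can help. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the module names `yake` and `simstring` are assumptions about how the two packages are imported, so adjust them to match requirements.txt if they differ.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for YAKE and SimString, plus numpy from Anaconda.
    missing = missing_modules(["numpy", "yake", "simstring"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All checked modules were found.")
```

An empty result only means the modules are discoverable in the active environment; it does not verify their versions.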

If you want to load fasttext .bin models (as used in the tutorial), you'll also need to install fasttext for Python by following the installation instructions from the fasttext project.
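
Once fasttext is installed, loading a .bin model might look like the sketch below. It assumes the official `fasttext` Python bindings, which provide `fasttext.load_model`; the helper name `load_fasttext_model` and the example path are hypothetical, and the tutorial may load its models differently.

```python
import importlib.util
from pathlib import Path


def load_fasttext_model(bin_path):
    """Load a fasttext .bin model, failing early with a clear message."""
    path = Path(bin_path)
    if not path.is_file():
        raise FileNotFoundError("No fasttext model found at: {}".format(path))
    if importlib.util.find_spec("fasttext") is None:
        raise ImportError("The fasttext Python package is not installed.")
    import fasttext  # imported lazily so the check above can run first
    return fasttext.load_model(str(path))


# Example usage (hypothetical path, replace with a model you downloaded):
# model = load_fasttext_model("path/to/model.bin")
# vector = model.get_word_vector("headache")
```

Checking the path before importing keeps the error message specific when the model file is simply missing.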

Download Data and Models

To use our linker, you'll need to download and unzip umls_2020_aa_cat0129_ext.zip into data/SimString. The linker uses additional files of smaller sizes, but these should already be included in this repository.

We also release a version of the linker without the aliases we added to UMLS. If you're interested in that version of the linker, then download and unzip umls_2020_aa_cat0129.zip instead, also into data/SimString.
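
If you prefer to unzip from Python rather than from a file manager or shell, the sketch below extracts a downloaded archive into data/SimString. The helper name `unzip_into` is hypothetical, and the code assumes the archive has already been downloaded to a local path of your choosing; it makes no assumption about the files inside the archive.

```python
import zipfile
from pathlib import Path


def unzip_into(archive_path, dest_dir="data/SimString"):
    """Extract a zip archive into dest_dir and return the archive's member names."""
    archive = str(archive_path)
    if not zipfile.is_zipfile(archive):
        raise ValueError("Not a readable zip archive: {}".format(archive))
    dest = Path(dest_dir)
    # Create the target folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(str(dest))
        return zf.namelist()


# Example usage, run from the repository root after downloading the archive:
# unzip_into("umls_2020_aa_cat0129_ext.zip")
```

The same call works for umls_2020_aa_cat0129.zip if you want the version without the added aliases.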

If you're only interested in the UMLS data, we recommend using scispacy's version, which we also used for this project.

You can also find the corpora and embeddings used in the tutorial and report in the reddit and europmc folders.

Using MedLinker-Social

Follow the tutorial at: https://github.com/danlou/MedLinker-Social/blob/main/tutorial.ipynb

License: Apache License 2.0