stormliucong / doc2hpo

doc2hpo

doc2hpo is a Java Spring MVC based web app that parses clinical notes and extracts HPO terms for Phenolyzer analysis.

Demo website

RESTful API Example

import requests

# Quick test: check that the service is reachable and list component versions.
url = "https://impact.dbmi.columbia.edu/doc2hpo/version"
r = requests.post(url)
print(r.json())
# {u'ncbo': None, u'java': u'1.8.0_191', u'tomcat': u'8.5.35', u'doc2hpo': u'1.21.0', u'metamaplite': u'metamaplite-3.6.2rc3.jar', u'metamap': u'2016v2'}

# String-based matching. Faster.
url = "https://impact.dbmi.columbia.edu/doc2hpo/parse/acdat"
# Named "payload" so that it does not shadow Python's json module.
payload = {
    "note": "He denies synophrys.",
    "negex": True
}
# headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.post(url, json=payload)
print(r.json())

# With negation detection enabled.
url = "https://impact.dbmi.columbia.edu/doc2hpo/parse/acdat"
# Named "payload" so that it does not shadow Python's json module.
payload = {
    "note": "He denies synophrys.",
    "negex": True
}
# headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.post(url, json=payload)
print(r.json())

# Using MetaMapLite. Much faster than the original MetaMap.
url = "https://impact.dbmi.columbia.edu/doc2hpo/parse/metamaplite"
# Named "payload" so that it does not shadow Python's json module.
payload = {
    "note": "He denies synophrys.",
    "negex": True
}
# headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.post(url, json=payload)
print(r.json())

# Using the NCBO Annotator. Recommended if a single file is large.
url = "https://impact.dbmi.columbia.edu/doc2hpo/parse/ncbo"
# Named "payload" so that it does not shadow Python's json module.
payload = {
    "note": "He denies synophrys.",
    "negex": True
}
# headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.post(url, json=payload)
print(r.json())
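
The parse calls above differ only in the engine segment of the URL, so they can be folded into one helper. The following is a minimal sketch and not part of doc2hpo: `build_request` and `parse_note` are hypothetical names, the standard library's `urllib` is used instead of `requests` to avoid a dependency, and it assumes every engine accepts the same `note`/`negex` payload shown above. The structure of the returned JSON is not described here, so the helper returns it unmodified.

```python
import json
import urllib.request

# Base URL of the public demo server used in the examples above.
BASE_URL = "https://impact.dbmi.columbia.edu/doc2hpo"

# Engine names taken from the /parse/<engine> endpoints shown above.
ENGINES = ("acdat", "metamaplite", "ncbo")


def build_request(engine, note, negex=True, base_url=BASE_URL):
    """Return (url, payload) for one parse call without sending it."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine {engine!r}; expected one of {ENGINES}")
    url = f"{base_url.rstrip('/')}/parse/{engine}"
    payload = {"note": note, "negex": bool(negex)}
    return url, payload


def parse_note(engine, note, negex=True, base_url=BASE_URL, timeout=60):
    """POST the note as JSON and return the decoded JSON response as-is."""
    url, payload = build_request(engine, note, negex, base_url)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `parse_note("metamaplite", "He denies synophrys.")` would send the same request as the MetaMapLite example above, and passing a different `base_url` would point the helper at a self-hosted instance.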

Deploy Doc2Hpo with Docker

  1. Clone the Doc2Hpo github repository
git clone https://github.com/stormliucong/doc2hpo.git
cd doc2hpo
  2. (Optional) Download MetaMapLite and the UMLS data if you don't already have them. See https://lhncbc.nlm.nih.gov/ii/tools/MetaMap/run-locally/MetaMapLite.html for details, and check the download.sh file for the version requirement. After downloading, unzip the files.
unzip public_mm_lite_3.6.2rc6_binaryonly.zip
unzip public_mm_data_lite_base_2020aa.zip
  3. Edit config.properties
# setup the proxy you want to use. Put null if don't use
Proxy=null
# setup the proxy port you want to use. Put null if don't use
Port=null
# ncbo api url. use public one https://data.bioontology.org by default
NcboUrl=https://data.bioontology.org
# ncbo api key. use Cong's api in the public demo server. Input your own for internal server.
NcboApiKey=put-your-own-api-key-here
# dir for metamaplite setting. (No need to change, if you follow this instruction.)
metamapliteDataRoot=/code/public_mm_lite/data
  4. Build the Doc2Hpo Docker image
docker build -t doc2hpo .
  5. Run the Doc2Hpo Docker container (add a second port mapping to 443 only if enabling HTTPS)
docker run -d -p [HOST:PORT]:8080 --name=doc2hpo-production doc2hpo
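
After `docker run` returns, the web app may still need some time to start. A minimal sketch of a readiness check follows. It assumes the container serves the app under the `/doc2hpo` context path, that the host port is 8080, and that `/version` answers a POST as in the API example; `version_url` and `wait_for_doc2hpo` are hypothetical helper names, not part of doc2hpo.

```python
import json
import time
import urllib.error
import urllib.request


def version_url(host="localhost", port=8080):
    """Build the URL of the version endpoint for a local deployment."""
    return f"http://{host}:{port}/doc2hpo/version"


def wait_for_doc2hpo(host="localhost", port=8080, attempts=10, delay=3.0):
    """Poll the version endpoint until it answers; return the JSON or None."""
    url = version_url(host, port)
    for _ in range(attempts):
        try:
            # The API example posts to /version, so an empty POST is sent here.
            request = urllib.request.Request(url, data=b"", method="POST")
            with urllib.request.urlopen(request, timeout=5) as response:
                return json.loads(response.read().decode("utf-8"))
        except (urllib.error.URLError, OSError, ValueError):
            # Connection refused or a non-JSON reply: the app may still be starting.
            time.sleep(delay)
    return None
```

If `wait_for_doc2hpo()` returns `None` after all attempts, `docker logs doc2hpo-production` is a reasonable next place to look.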

[DEPRECATED] Instructions for manually deploying Doc2Hpo

Step 0: Download everything you need

#### 0. install java if you don't have one
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
javac -version
#### 1. download apache tomcat if you don't have one
wget https://archive.apache.org/dist/tomcat/tomcat-8/v8.5.35/bin/apache-tomcat-8.5.35.tar.gz
#### 2. download apache maven if you don't have one
wget https://archive.apache.org/dist/maven/maven-3/3.6.0/binaries/apache-maven-3.6.0-bin.tar.gz
#### 3. git clone this repository 
git clone https://github.com/stormliucong/doc2hpo.git

4. (Optional) Download MetaMap and the MetaMap Java API if you don't have them

Please visit https://metamap.nlm.nih.gov/ to download the MetaMap 2016v2 Linux version and the MetaMap Java API release for Linux.

5. (Optional) Download MetaMapLite if you don't have it

Please visit https://metamap.nlm.nih.gov/MetaMapLite.shtml to download MetaMapLite 2018 3.6.2rc3 with the Category 0+4+9 (USAbase) 2018AA UMLS dataset.

6. Put everything in one directory (let's call it myproject for now)

You should now have apache-maven-3.6.0-bin.tar.gz, public_mm_linux_main_2016v2.tar.bz2, public_mm_linux_javaapi_2016v2.tar.bz2, public_mm_lite_3.6.2rc3.zip, apache-tomcat-8.5.35.tar.gz, and doc2hpo/ under myproject.
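
Before moving on to Step 1, it can help to confirm that nothing is missing. The sketch below is only an illustration: `missing_items` is a hypothetical helper, and the names in `EXPECTED` are the archive and directory names that Steps 0 and 1 of this guide refer to. The MetaMap and MetaMapLite archives are optional, so remove them from the list if you skipped those downloads.

```python
from pathlib import Path

# File and directory names as used in Steps 0 and 1 of this guide.
EXPECTED = [
    "apache-maven-3.6.0-bin.tar.gz",
    "apache-tomcat-8.5.35.tar.gz",
    "public_mm_linux_main_2016v2.tar.bz2",
    "public_mm_linux_javaapi_2016v2.tar.bz2",
    "public_mm_lite_3.6.2rc3.zip",
    "doc2hpo",
]


def missing_items(project_dir, expected=EXPECTED):
    """Return the expected names that are not present in project_dir."""
    root = Path(project_dir)
    return [name for name in expected if not (root / name).exists()]
```

Calling `missing_items("myproject")` returns an empty list when everything is in place.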

Step 1: Installation of everything you need

cd myproject
#### 1. install tomcat
tar -xvf apache-tomcat-8.5.35.tar.gz
#### 2. install maven
tar -xzvf apache-maven-3.6.0-bin.tar.gz
#### 3. install MetaMap
bunzip2 -c public_mm_linux_main_2016v2.tar.bz2 | tar xvf -
cd public_mm/
./bin/install.sh
# press enter to use default settings #
#### 4. install MetaMap Java API
cd ../
bzip2 -dc public_mm_linux_javaapi_2016v2.tar.bz2 | tar xvf -
cd public_mm/
./bin/install.sh
# press enter to use default settings #
# test java api #
./bin/testapi.sh breast cancer
#### 5. install MetaMapLite
cd ../
unzip public_mm_lite_3.6.2rc3.zip

Step 2: Configuration of Doc2Hpo

cd myproject
#### 1. copy MetaMap Java API jars to lib
mkdir ./doc2hpo/src/main/webapp/WEB-INF/lib
cp ./public_mm/src/javaapi/target/metamap-api-2.0.jar ./doc2hpo/src/main/webapp/WEB-INF/lib
cp ./public_mm/src/javaapi/dist/prologbeans.jar ./doc2hpo/src/main/webapp/WEB-INF/lib
cp ./public_mm/src/javaapi/dist/MetaMapApi.jar ./doc2hpo/src/main/webapp/WEB-INF/lib
#### 2. copy mmlite jar to lib
cp ./public_mm_lite/target/metamaplite-3.6.2rc3.jar ./doc2hpo/src/main/webapp/WEB-INF/lib
cp ./public_mm_lite/lib/* ./doc2hpo/src/main/webapp/WEB-INF/lib
#### 3. change config file (if necessary)
##### Important. Make sure you set everything correctly.
##### Otherwise there is a 404 error and some engines in doc2hpo might not work properly

Example configuration file

# please change the name of this file to config.properties_bak.
# setup the proxy you want to use. Put null if don't use
Proxy=null
# setup the proxy port you want to use. Put null if don't use
Port=null
# ncbo api url. use public one https://data.bioontology.org by default
NcboUrl=https://data.bioontology.org
# ncbo api key. use Cong's api in the public demo server. Input your own for internal server.
NcboApiKey=xxxxxxxxxxx 
# dir for metamaplite setting
metamapliteDataRoot=/home/cl3720/public_mm_lite/data

cd ./doc2hpo/src/main/webapp/WEB-INF
# rename config.properties_bak to config.properties, then edit it
mv config.properties_bak config.properties
vi config.properties
# return to myproject
cd ../../../../../
cd ./doc2hpo/properties
# adjust the MetaMapLite database path if needed;
# the default usually needs no change.
vi metamaplite.properties
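
Because a misconfigured file can lead to 404 errors or non-working engines, a quick sanity check of config.properties before building may save a rebuild. The following is a minimal sketch under stated assumptions: `parse_properties` and `config_problems` are hypothetical helpers, the list of required keys is simply the set of keys in the example file above, and the placeholder values are the ones that appear in this guide.

```python
# Keys that appear in the example configuration file in this guide.
REQUIRED_KEYS = ("Proxy", "Port", "NcboUrl", "NcboApiKey", "metamapliteDataRoot")

# Placeholder API key taken from the Docker example in this guide.
PLACEHOLDERS = ("put-your-own-api-key-here",)


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def config_problems(props):
    """Return a list of human-readable problems found in the parsed config."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in props]
    api_key = props.get("NcboApiKey", "")
    # A key made only of "x" characters is treated as the example placeholder.
    if api_key in PLACEHOLDERS or (api_key and set(api_key) == {"x"}):
        problems.append("NcboApiKey still holds a placeholder value")
    return problems
```

A typical use would be `config_problems(parse_properties(open("config.properties").read()))`, which returns an empty list when no problem is detected.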

Step 3: Deployment of Doc2Hpo

cd myproject
#### 1. maven compile
cd doc2hpo/
../apache-maven-3.6.0/bin/mvn clean validate install
cp ./target/doc2hpo.war ../apache-tomcat-8.5.35/webapps/
#### 2. start MetaMap server (optional)
# return to myproject
cd ../
./public_mm/bin/skrmedpostctl start
./public_mm/bin/wsdserverctl start
nohup ./public_mm/bin/mmserver16 &
#### 3. start tomcat
./apache-tomcat-8.5.35/bin/startup.sh

Step 4: Visit Doc2Hpo at http://localhost:8080/doc2hpo, and you are all set!

Step 5: For IMPACT2 server only.

Change the AJAX URL to /doc2hpo/parse/acdat for Apache delegation (a better solution is required).

References

Download link

Documentation

  • You need a free UMLS license to install the software.
  • Starting supporting servers and running the MetaMap server
  • Change MetamapBinPath in doc2hpo/src/main/webapp/WEB-INF/config.properties.
  • Change the API key for the NCBO Annotator in doc2hpo/src/main/webapp/WEB-INF/config.properties.
  • Add the proxy and port, if necessary, in doc2hpo/src/main/webapp/WEB-INF/config.properties.
  • Set Proxy=null and Port=null if you don't need a proxy.
  • Export the doc2hpo.war file for the project, using either Eclipse or Maven.
  • Deploy the WAR file under apache-tomcat-8.5.35/webapps.
  • Make sure the file permissions are correct.
  • Start Tomcat and browse the results.
  • You can check the version requirements by calling the API at servername:8080/doc2hpo/version.
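
The version check in the last item can be automated by comparing the `/version` response with the versions you expect. The sketch below is an illustration only: `version_mismatches` is a hypothetical helper, and the values in `EXPECTED_VERSIONS` are copied from the sample response in the RESTful API example. Whether those exact versions are strict requirements is an assumption.

```python
# Values copied from the sample /version response in the API example;
# adjust them to match your own installation.
EXPECTED_VERSIONS = {
    "java": "1.8.0_191",
    "tomcat": "8.5.35",
    "metamap": "2016v2",
    "metamaplite": "metamaplite-3.6.2rc3.jar",
}


def version_mismatches(reported, expected=EXPECTED_VERSIONS):
    """Map each differing component to a (reported, expected) pair."""
    return {
        name: (reported.get(name), wanted)
        for name, wanted in expected.items()
        if reported.get(name) != wanted
    }
```

Passing the decoded JSON from `/version` to `version_mismatches` yields an empty dictionary when every listed component matches.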

Versioning

1.21.0

New features under development

Publications

Cong Liu, Fabricio Sampaio Peres Kury, Ziran Li, Casey Ta, Kai Wang, Chunhua Weng, Doc2Hpo: a web application for efficient and accurate HPO concept curation, Nucleic Acids Research, gkz386, https://doi.org/10.1093/nar/gkz386

Authors

Cong Liu, Chi Yuan, Kai Wang, Chunhua Weng

Contact: stormliucong@gmail.com
