- Unzip the contents of the "prodigy-1.11.6-linux.zip" file into this project's wheel directory.
- In a terminal, navigate to this repository's directory and run the following command to build the Docker image:
docker build . -t prodigy
- In a terminal, navigate to this repository's directory and run the following command to start a Docker container for using Prodigy:
docker run -it --user $(id -u):$(id -g) -p 8080:8080 -v "$(pwd)"/outputs:/outputs -v "$(pwd)"/prodigy:/prodigy_home -v "$(pwd)"/datasets:/datasets prodigy bash
After starting the container, you can run Prodigy commands such as prodigy stats. See https://prodi.gy/docs for more details about the available commands.
- To stop the container, press Ctrl+D.
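The docker run command above mounts three host directories (outputs, prodigy, and datasets) into the container. As a small precaution, and assuming you run it from the repository root, you may want to create any missing directories yourself first. When a bind-mount source is missing, Docker typically creates it owned by root, which the non-root user passed via --user may not be able to write to.

```shell
# Create the host directories that the docker run command mounts as volumes.
# -p makes this a no-op for directories that already exist.
mkdir -p outputs prodigy datasets
```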
This repository is organized into the following directories:
- datasets: In this directory, you should store the Prodigy JSONL files (inputs and outputs).
- prodigy: This directory is the PRODIGY_HOME directory. It stores the tool configuration and the *.db files that hold the annotations.
- wheel: In this directory, you should store the wheel files for installation.
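For reference, a JSONL input file holds one JSON object per line, and the text classification recipes read the text to annotate from a "text" field. The sketch below writes a tiny illustrative file from the host. The file name example.jsonl and the two sentences are placeholders, not data from this project.

```shell
# Write a two-line example input file into the datasets directory.
# Each line is a standalone JSON object with a "text" field.
mkdir -p datasets
cat > datasets/example.jsonl <<'EOF'
{"text": "First example report text."}
{"text": "Second example report text."}
EOF

# Show the number of records (one per line).
wc -l < datasets/example.jsonl
```

Inside the container, the same file is visible as /datasets/example.jsonl because of the datasets volume mount.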
prodigy textcat.manual binary_report_classification_task /datasets/dataset.jsonl --label ABNORMAL,STOCK --exclusive
prodigy textcat.manual report_classification_task /datasets/dataset.jsonl --label ABNORMAL,STOCK,DEFACTO,CONTEXTUAL,INCIDENTAL --exclusive
prodigy textcat.manual comparative_task /datasets/dataset.jsonl --label COMPARATIVE,NONCOMPARATIVE --exclusive
prodigy textcat.manual domain_review /datasets/domain-review-dataset.jsonl --label pathology-cerebrovascular,pathology-csf-disorders,pathology-endocrine,pathology-infectious,pathology-neoplastic-paraneoplastic,pathology-inflammatory-autoimmune,pathology-metabolic-nutritional-toxic,pathology-neurodegenerative-dementia,pathology-traumatic,pathology-musculoskeletal,pathology-treatment,pathology-congenital-developmental,pathology-opthalmalogical,pathology-ischaemic,pathology-haemorrhagic,pathology-vascular
prodigy db-out binary_report_classification_task > /datasets/binary_report_classification_task_labelled.jsonl
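After exporting with db-out, a quick sanity check is to count how many records carry each answer value. The helper below is a rough sketch: count_answers is a hypothetical name, and it assumes each exported line contains an "answer" field (for example accept, reject, or ignore) serialized with at most one space after the colon.

```shell
# Hypothetical helper: tally the "answer" values found in an exported JSONL file.
# Usage: count_answers <path-to-exported.jsonl>
count_answers() {
  # Extract every "answer": "<value>" occurrence, then count identical matches.
  grep -oE '"answer": ?"[a-z]+"' "$1" | sort | uniq -c
}
```

Inside the container it could be called as count_answers /datasets/binary_report_classification_task_labelled.jsonl, which prints one line per answer value with its count.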