cross-reference: find patients who have 1+ ECG in pre-event window AND 1+ ECG in post-event window

Question

erikr opened this issue 4 years ago · comments

What
Enhance cross_reference to find patients who have 1+ ECG in a pre-event window and 1+ ECG in a post-event window, i.e. patients with "paired" data.

Why
We are often interested only in patients who have 1+ ECG before some event as well as 1+ ECG after it.

Examples:

Initiation of immune checkpoint inhibitor therapy in cancer patients, which can potentially damage the heart (ecg-ici, a project that we have not yet prioritized but would like to start in the next few months)
Aortic valve surgery, which can potentially trigger or worsen arrhythmias (sts-afib project board)

How

The new arguments --reference_start_time_tensor_paired and --reference_end_time_tensor_paired would let a user call cross_reference to find ECGs from patients who have 1+ ECG before a surgery as well as 1+ ECG after it:

./scripts/tf.sh -c -t \
    ${HOME}/ml/ml4cvd/recipes.py \
    --mode cross_reference \
    --tensors_name ecg \
    --tensors /data/partners_ecg/mgh/explore/tensors_all_union.csv \
    --time_tensor partners_ecg_datetime \
    --reference_tensors /data/sts-afib/mgh-afib-after-avr-metadata.csv \
    --reference_name sts-afib-after-avr \
    --reference_join_tensors partners_ecg_patientid_clean \
    --reference_join_tensors mrn \
    --reference_start_time_tensor surgery_date -180 \
    --reference_end_time_tensor surgery_date \
    --reference_start_time_tensor_paired surgery_date \
    --reference_end_time_tensor_paired surgery_date +180 \
    --output_folder $HOME \
    --id sts-afib-ecg-crossref-180-days-preop

Acceptance Criteria
The above command runs cross_reference to find patients who have 1+ ECG in the pre-event window and 1+ ECG in the post-event window, and quantifies ECG coverage.
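The pairing logic requested here can be sketched in a few lines of plain Python. This is only an illustration of the intended behavior, not the cross_reference implementation: the function name `patients_with_paired_ecgs`, its parameters, and the toy data are hypothetical. It also assumes that an ECG recorded on the event date itself counts toward the post-event window, which this issue does not specify.

```python
from datetime import date, timedelta


def patients_with_paired_ecgs(ecgs, events, days_before=180, days_after=180):
    """Return the set of patient IDs with 1+ ECG before AND 1+ ECG after the event.

    ecgs:   iterable of (patient_id, ecg_date) pairs
    events: dict mapping patient_id -> event_date (e.g. surgery date)
    """
    has_pre, has_post = set(), set()
    for patient_id, ecg_date in ecgs:
        event_date = events.get(patient_id)
        if event_date is None:
            continue  # patient is not in the reference table
        pre_start = event_date - timedelta(days=days_before)
        post_end = event_date + timedelta(days=days_after)
        if pre_start <= ecg_date < event_date:
            has_pre.add(patient_id)
        elif event_date <= ecg_date <= post_end:
            # Assumption: a same-day ECG is treated as post-event.
            has_post.add(patient_id)
    # Only patients present in both windows have "paired" data.
    return has_pre & has_post


# Toy data, invented for illustration only.
events = {"A": date(2020, 5, 15), "B": date(2020, 5, 15), "C": date(2020, 5, 15)}
ecgs = [
    ("A", date(2020, 3, 1)), ("A", date(2020, 6, 1)),  # pre and post
    ("B", date(2020, 4, 1)),                           # pre only
    ("C", date(2020, 7, 1)),                           # post only
    ("D", date(2020, 5, 1)),                           # no reference event
]
paired = patients_with_paired_ecgs(ecgs, events)
```

With this toy data only patient "A" has an ECG in both windows.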

Steven Song · Answer 1 · Thu May 28 2020 21:51:37 GMT+0800 (China Standard Time)

This is really a desire to find cross-referenced data in multiple time windows. Instead of allowing only 2, allow any number of time windows by specifying reference_start/end_time_tensor multiple times.

An additional augmentation will be to let users specify how many data points are needed in each time window and which events in the time series to keep (newest/oldest/random).

The arguments will probably look like this:

--mode cross_reference
--output_folder $HOME
--id sts-afib-ecg-crossref-180-days-preop

# Source Tensors
--tensors_name ecg
--tensors /data/partners_ecg/mgh/explore/tensors_all_union.csv
--join_tensor partners_ecg_patientid_clean
--time_tensor partners_ecg_datetime

# Reference Tensors
--reference_tensors /data/sts-afib/mgh-afib-after-avr-metadata.csv
--reference_name sts-afib-after-avr
--reference_join_tensors mrn

# Time Window 1
--reference_start_time_tensor  surgery_date -180
--reference_end_time_tensor    surgery_date
--number_in_window             1
--which_in_window              newest
--window_name                  pre-op

# Time Window 2
--reference_start_time_tensor  surgery_date
--reference_end_time_tensor    surgery_date  180
--number_in_window             1
--which_in_window              oldest
--window_name                  post-op

The output will likely change; details to follow during implementation.
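One way to accept repeated window arguments is argparse's `action="append"` combined with `nargs="+"`. The sketch below is a guess at how the parsing could work, not the eventual ml4cvd code: `build_parser`, `parse_windows`, and the dictionary layout are hypothetical. It assumes each window supplies all five flags and that windows are matched by the order in which each flag occurs.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # action="append" collects one entry per occurrence of a flag;
    # nargs="+" lets each occurrence carry a column name plus an optional day offset.
    parser.add_argument("--reference_start_time_tensor", action="append", nargs="+", required=True)
    parser.add_argument("--reference_end_time_tensor", action="append", nargs="+", required=True)
    parser.add_argument("--number_in_window", action="append", type=int, required=True)
    parser.add_argument("--which_in_window", action="append",
                        choices=["newest", "oldest", "random"], required=True)
    parser.add_argument("--window_name", action="append", required=True)
    return parser


def _time_spec(tokens):
    """Turn ['surgery_date', '-180'] into ('surgery_date', -180); the offset defaults to 0."""
    if len(tokens) > 2:
        raise ValueError(f"expected a column and an optional day offset, got {tokens}")
    offset = int(tokens[1]) if len(tokens) == 2 else 0
    return tokens[0], offset


def parse_windows(args):
    groups = [
        args.reference_start_time_tensor,
        args.reference_end_time_tensor,
        args.number_in_window,
        args.which_in_window,
        args.window_name,
    ]
    if len({len(group) for group in groups}) != 1:
        raise ValueError("every time window must specify all five window arguments")
    return [
        {"name": name, "start": _time_spec(start), "end": _time_spec(end),
         "number": number, "which": which}
        for start, end, number, which, name in zip(*groups)
    ]


argv = [
    "--reference_start_time_tensor", "surgery_date", "-180",
    "--reference_end_time_tensor", "surgery_date",
    "--number_in_window", "1", "--which_in_window", "newest", "--window_name", "pre-op",
    "--reference_start_time_tensor", "surgery_date",
    "--reference_end_time_tensor", "surgery_date", "180",
    "--number_in_window", "1", "--which_in_window", "oldest", "--window_name", "post-op",
]
windows = parse_windows(build_parser().parse_args(argv))
```

argparse reads `-180` as a negative number rather than a flag because this parser defines no options that look like negative numbers.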

Erik Reinertsen · Answer 2 · Fri May 29 2020 04:23:08 GMT+0800 (China Standard Time)

Can you clarify what these args do?

--number_in_window             1
--which_in_window              newest

If they serve a key purpose, don't waste time explaining in a comment; better to just explain it in a docstring and point me to that line in the code :)

Steven Song · Answer 3 · Fri May 29 2020 23:02:31 GMT+0800 (China Standard Time)

> Can you clarify what these args do?
> --number_in_window             1
> --which_in_window              newest

Let's say that for patient 123 you had these data:

ecg 5/12
ecg 5/13
ecg 5/14
surgery 5/15
ecg 5/16
ecg 5/17
ecg 5/18

and you wanted to get the 1 newest pre-op ECG and the 2 oldest post-op ECGs, so:

ecg 5/14
surgery 5/15
ecg 5/16
ecg 5/17

You can use these args:

# pre-op window
--window_name      pre-op
--number_in_window 1 
--which_in_window  newest

# post-op window
--window_name      post-op
--number_in_window 2
--which_in_window  oldest
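The patient 123 example can be reproduced with a short Python sketch of the selection step. The helper `select_in_window` and its signature are hypothetical, and the year 2020 is assumed only so that the dates are valid. Window bounds are inclusive here, so an ECG on the surgery date would fall into both windows; the example has none.

```python
import random
from datetime import date, timedelta


def select_in_window(ecg_dates, start, end, number, which):
    """Keep up to `number` ECG dates within [start, end], chosen by `which`."""
    in_window = sorted(d for d in ecg_dates if start <= d <= end)
    if number <= 0:
        return []
    if which == "newest":
        return in_window[-number:]
    if which == "oldest":
        return in_window[:number]
    if which == "random":
        return sorted(random.sample(in_window, min(number, len(in_window))))
    raise ValueError(f"unknown selection rule: {which}")


YEAR = 2020  # assumption: the thread gives only month/day
ecg_dates = [date(YEAR, 5, day) for day in (12, 13, 14, 16, 17, 18)]
surgery = date(YEAR, 5, 15)

# pre-op window: the 1 newest ECG in the 180 days before surgery
pre_op = select_in_window(ecg_dates, surgery - timedelta(days=180), surgery, 1, "newest")
# post-op window: the 2 oldest ECGs in the 180 days after surgery
post_op = select_in_window(ecg_dates, surgery, surgery + timedelta(days=180), 2, "oldest")
```

Here `pre_op` holds the 5/14 ECG and `post_op` holds the 5/16 and 5/17 ECGs, matching the listing above.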