drivendataorg / concept-to-clinic

ALCF Concept to Clinic Challenge

Home Page: https://concepttoclinic.drivendata.org/

Classification throws RuntimeError for real nodule location

WGierke opened this issue · comments

Related to #229: While the nodule classification works when given nodule locations with small values in each dimension, it fails when I try to classify a real nodule (the first one of patient LIDC-IDRI-0003).

Expected Behavior

A cancer probability should be returned.

Current Behavior

An error is thrown:

(venv3.5) ➜  concept-to-clinic git:(master) ✗ docker exec -it concepttoclinic_prediction_run_259  pytest -vrsk src/tests/test_classification.py
=========================================================================================== test session starts ============================================================================================
platform linux -- Python 3.6.1, pytest-3.1.3, py-1.4.34, pluggy-0.4.0 -- /usr/bin/python3.6
cachedir: .cache
rootdir: /app, inifile:
collected 4 items 

src/tests/test_classification.py::test_classify_predict_load PASSED
src/tests/test_classification.py::test_classify_dicom PASSED
src/tests/test_classification.py::test_classify_dicom_nodule FAILED
src/tests/test_classification.py::test_classify_luna PASSED

================================================================================================= FAILURES =================================================================================================
________________________________________________________________________________________ test_classify_dicom_nodule ________________________________________________________________________________________

dicom_path_003 = '/images/LIDC-IDRI-0003/1.3.6.1.4.1.14519.5.2.1.6279.6001.101370605276577556143013894866/1.3.6.1.4.1.14519.5.2.1.6279.6001.170706757615202213033480003264'
model_path = '/app/src/algorithms/classify/assets/gtr123_model.ckpt'

    def test_classify_dicom_nodule(dicom_path_003, model_path):
>       predicted = trained_model.predict(dicom_path_003, [{'x': 367, 'y': 350, 'z': 72}], model_path)

src/tests/test_classification.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/algorithms/classify/trained_model.py:40: in predict
    return gtr123_model.predict(dicom_path, centroids, model_path)
src/algorithms/classify/src/gtr123_model.py:275: in predict
    _, pred, _ = casenet(cropped_image, coords)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py:224: in __call__
    result = self.forward(*input, **kwargs)
src/algorithms/classify/src/gtr123_model.py:215: in forward
    noduleFeat, nodulePred = self.NoduleNet(xlist, coordlist)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py:224: in __call__
    result = self.forward(*input, **kwargs)
src/algorithms/classify/src/gtr123_model.py:175: in forward
    feat = self.back2(torch.cat((rev2, out2, coord), 1))  # 64+64
/usr/local/lib/python3.6/dist-packages/torch/autograd/variable.py:897: in cat
    return Concat.apply(dim, *iterable)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ctx = <torch.autograd.function.ConcatBackward object at 0x7f6c231bbb88>, dim = 1
inputs = (
(0 ,0 ,0 ,.,.) = 
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.8768  0.6747  1.5596  ...   (intermediate rows truncated)
  0.2595  0.2695  0.2794  ...   0.4682  0.4782  0.4881
[torch.FloatTensor of size 1x3x24x24x24]
)

    @staticmethod
    def forward(ctx, dim, *inputs):
        ctx.dim = dim
        ctx.input_sizes = [i.size(dim) for i in inputs]
>       return torch.cat(inputs, dim)
E       RuntimeError: inconsistent tensor sizes at /pytorch/torch/lib/TH/generic/THTensorMath.c:2709

/usr/local/lib/python3.6/dist-packages/torch/autograd/_functions/tensor.py:317: RuntimeError
------------------------------------------------------------------------------------------- Captured stdout call -------------------------------------------------------------------------------------------
torch.Size([1, 1, 52, 96, 96])
=================================================================================== 1 failed, 3 passed in 16.99 seconds ====================================================================================

See here to verify that there is a nodule at this position:
[screenshot: CT slice of LIDC-IDRI-0003 showing the nodule at the given coordinates]

Steps to Reproduce

read #268 (comment) first

patch.txt

git checkout 42ba1b121504ad318210463d422f5ca2947cbacb
git apply patch.txt
docker-compose -f local.yml run prediction pytest -vrsk src/tests/test_classification.py

Possible Solution

I guess it has something to do with the input size the gtr123 model expects: the captured stdout shows a cropped patch of size [1, 1, 52, 96, 96] instead of the expected (96, 96, 96).

Checklist before submitting

  • I have confirmed this using the officially supported Docker Compose setup with the local.yml configuration, and ensured that I rebuilt the containers so they reflect the most recent version of the project at the HEAD commit on the master branch
  • I have searched through the other currently open issues and am confident this is not a duplicate of an existing bug
  • I provided a minimal code snippet or list of steps that reproduces the bug.
  • I provided screenshots where appropriate
  • I filled out all the relevant sections of this template

confirmed

I did some checking... I may be wrong, but it looks like the specified slice is not valid.
The one we are running the test with is {'x': 367, 'y': 349, 'z': 72}, but the DICOM at this slice looks like a mess:
[screenshot: garbled-looking DICOM slice at z=72]

However, when I run the test with {'x': 367, 'y': 349, 'z': 189}, it doesn't fail. Here is how the image looks at that slice:
[screenshot: clean CT slice at z=189 showing the nodule]

@WGierke, @reubano what do you think?

I knew I missed something. Will keep checking this issue.

The mentioned issue is quite simple: the test is running on a small DICOM image, and 72 is far outside the bounds of the Z dimension (0, 28). The cropping will therefore fail to create a patch of the expected size (96, 96, 96). The test can be fixed by simply replacing dicom_path_003 (the small image) with dicom_paths[2] (the full image).
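To make the failure concrete, here is a minimal, self-contained sketch (illustrative only, not project code) of why cropping a 96-voxel window around z=72 in a volume with only 28 slices cannot produce a full-depth patch:

import numpy as np

# Stand-in for the small DICOM volume: only 28 slices deep.
volume = np.zeros((28, 512, 512), dtype=np.float32)

z_center, crop_size = 72, 96
z_start = z_center - crop_size // 2   # 24
z_end = z_center + crop_size // 2     # 120, far past the last slice index (27)

patch = volume[max(z_start, 0):z_end]
print(patch.shape)                    # (4, 512, 512): only 4 slices instead of 96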

While troubleshooting this issue I found another problem: the preprocessing is zooming the image, but the coordinates we use for prediction are not scaled according to the zoom, so the actual patch we make the prediction on differs from the expected patch.
For example, the zooming factor for the full LIDC-IDRI-0003 image is [2.5, 0.820312, 0.820312]. The prediction result for the coordinates of the real nodule {'x': 367, 'y': 350, 'z': 72} is 0.0066, which is very low. But if you scale these coordinates by the current zooming factor ({'x': 301, 'y': 286, 'z': 180}), we get a much higher concerning probability of 0.42.
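A minimal sketch of that rescaling (the helper rescale_centroid is hypothetical, not project API; rounding conventions may differ slightly from the numbers quoted above):

def rescale_centroid(centroid, zoom_factor):
    """Scale voxel coordinates by the per-axis (z, y, x) preprocessing zoom."""
    fz, fy, fx = zoom_factor
    return {'x': int(centroid['x'] * fx),
            'y': int(centroid['y'] * fy),
            'z': int(centroid['z'] * fz)}

# Zooming factor of the full LIDC-IDRI-0003 image, as quoted above:
print(rescale_centroid({'x': 367, 'y': 350, 'z': 72}, (2.5, 0.820312, 0.820312)))
# -> {'x': 301, 'y': 287, 'z': 180}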

I am looking for a good way to fix it, but if you have some ideas, feel free to share :)

Probably the zooming factor from the preprocess method should be saved as a property of the CT MetaData object and then used by the crop_patch method instead of spacing? A rough sketch of the idea is below.
@WGierke, @reubano
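One possible shape for that proposal, as a hedged sketch (MetaData here is a stripped-down stand-in, and zoom_factor is a hypothetical attribute, not existing project API):

import numpy as np
import scipy.ndimage

class MetaData:
    """Minimal stand-in for the CT metadata object."""
    def __init__(self, spacing):
        self.spacing = spacing       # overwritten by preprocessing today
        self.zoom_factor = None      # proposed: preserved for crop_patch

def preprocess(ct_array, meta):
    """Resample to unit spacing, remembering the zoom on the meta object."""
    meta.zoom_factor = tuple(meta.spacing)
    ct_array = scipy.ndimage.zoom(ct_array, meta.spacing, order=1)
    meta.spacing = (1.0, 1.0, 1.0)
    return ct_array, meta

ct, meta = preprocess(np.zeros((28, 64, 64)), MetaData((2.5, 0.820312, 0.820312)))
print(ct.shape, meta.zoom_factor)    # zoomed shape plus the preserved factor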

@vessemer, I think you may also be interested in this discussion.

The mentioned issue is quite simple: the test is running on a small DICOM image, and 72 is far outside the bounds of the Z dimension (0, 28). The cropping will therefore fail to create a patch of the expected size (96, 96, 96). The test can be fixed by simply replacing dicom_path_003 (the small image) with dicom_paths[2] (the full image).

Is there, instead, another coordinate that works for both large and small images?

Is there, instead, another coordinate that works for both large and small images?

Yes, there is. I don't have one at hand right now, but I will post it within an hour. Actually, any z value between 0 and 28 should be fine.

Probably the zooming factor from the preprocess method should be saved as a property of the CT MetaData object and then used by the crop_patch method instead of spacing?

Hmm, self.spacing is used by two different classes. If they are in fact referencing the same thing, we should rethink how the code is organized. Maybe have MetaData also subclass Params?

But if you scale these coordinates by the current zooming factor ({'x': 301, 'y': 286, 'z': 180}), we get a much higher concerning probability of 0.42.

My suggestion would be to wrap whatever method you used to calculate those new coordinates in a function, then call that function in the conditional so that, after zooming, it recalculates the coords.


@Serhiy-Shekhovtsov, thanks for mentioning me. If I'm correct, the input coordinates should always be in real units (mm), not in voxels, and for this purpose we always store the current spacing in meta, which is updated with each zoom.

@vessemer the problem we had is that the zooming factor got lost after zooming. When params.spacing is 1 and meta.spacing is 2.5, we zoom the image and then wipe out meta.spacing, so afterwards we don't know how to scale the coordinates. Also, I noticed that there is no other use for params.spacing, so I converted it to a boolean. When it's true, we zoom the image during preprocessing according to meta.spacing, but we also keep that value for future coordinate rescaling.


@Serhiy-Shekhovtsov, nope, the point is that we don't need to store the zooming factor, because it is computed from meta.spacing and params.spacing here. Then meta.spacing is updated here. Thus the position of a candidate given in mm should be obtained via meta.spacing. Also, one CT scan may be zoomed several times and only carries the final spacing, not the sequence of zooming factors.
I've already described where params.spacing is used, and taking into account that preprocessing differs between approaches, params.spacing should be either a float or a tuple of floats.
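For illustration, here is a minimal, hedged sketch of that convention (mm_to_voxel is a hypothetical helper, not project code): if candidate positions are kept in mm, converting them to voxel indices only ever needs the current meta.spacing, no matter how many times the scan has been resampled:

def mm_to_voxel(position_mm, spacing):
    """Convert a (z, y, x) position in mm to voxel indices at the given spacing."""
    return tuple(int(p / s) for p, s in zip(position_mm, spacing))

point_mm = (180.0, 287.0, 301.0)                         # one physical location
print(mm_to_voxel(point_mm, (2.5, 0.820312, 0.820312)))  # indices in the raw scan
print(mm_to_voxel(point_mm, (1.0, 1.0, 1.0)))            # indices after resampling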

In the current implementation the preprocessing overrides meta.spacing, making it impossible to scale the coordinates. Here is a real example: when params.spacing is 1 and meta.spacing is 2.5, we zoom the image and then set meta.spacing = 1. The 2.5 value is lost after that, and we cannot scale the coordinates.
Also, I can't see any usage of params.spacing except in this one place, where it is 1, so it doesn't affect the scaling ratio.


If you store coordinates in mm, then you don't need to keep the old meta.spacing values. It is correct that params.spacing usage is currently limited to the one case only; nonetheless, I don't see the point of the restriction you proposed for the params.spacing variable.

Coordinates are stored in non-scaled form, so we have to scale them, and for that we need to know the spacing. So the point of the suggested change is preserving the properties of the meta object.

@reubano this issue is solved.

@Serhiy-Shekhovtsov still failing for me on master

$ sudo docker-compose -f local.yml run prediction pytest -vrsk src/tests/test_classification.py
Starting base ... 
Starting base ... done
======================================== test session starts =========================================
platform linux -- Python 3.6.3, pytest-3.1.3, py-1.5.2, pluggy-0.4.0 -- /usr/bin/python3.6
cachedir: .cache
rootdir: /app, inifile:
collected 6 items 

src/tests/test_classification.py::test_classify_predict_load PASSED
src/tests/test_classification.py::test_classify_dicom PASSED
src/tests/test_classification.py::test_classify_real_nodule_small_dicom PASSED
src/tests/test_classification.py::test_classify_dicom_nodule FAILED
src/tests/test_classification.py::test_classify_real_nodule_full_dicom PASSED
src/tests/test_classification.py::test_classify_luna PASSED

============================================== FAILURES ==============================================
_____________________________________ test_classify_dicom_nodule _____________________________________

dicom_path_003 = '/images/LIDC-IDRI-0003/1.3.6.1.4.1.14519.5.2.1.6279.6001.101370605276577556143013894866/1.3.6.1.4.1.14519.5.2.1.6279.6001.170706757615202213033480003264'
model_path = '/app/src/algorithms/classify/assets/gtr123_model.ckpt'

    def test_classify_dicom_nodule(dicom_path_003, model_path):
>       predicted = trained_model.predict(dicom_path_003, [{'x': 367, 'y': 349, 'z': 72}], model_path)

src/tests/test_classification.py:21: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/algorithms/classify/trained_model.py:40: in predict
    return gtr123_model.predict(dicom_path, centroids, model_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ct_path = '/images/LIDC-IDRI-0003/1.3.6.1.4.1.14519.5.2.1.6279.6001.101370605276577556143013894866/1.3.6.1.4.1.14519.5.2.1.6279.6001.170706757615202213033480003264'
nodule_list = [{'x': 367, 'y': 349, 'z': 72}]
model_path = '/app/src/algorithms/classify/assets/gtr123_model.ckpt'

    def predict(ct_path, nodule_list, model_path=None):
        """
    
        Args:
          ct_path (str): path to a MetaImage or DICOM data.
          nodule_list: List of nodules
          model_path: Path to the torch model (Default value = "src/algorithms/classify/assets/gtr123_model.ckpt")
    
        Returns:
          List of nodules, and probabilities
    
        """
        if not model_path:
            CLASSIFY_DIR = path.join(Config.ALGOS_DIR, 'classify')
            model_path = path.join(CLASSIFY_DIR, 'assets', 'gtr123_model.ckpt')
    
        if not nodule_list:
            return []
    
        casenet = CaseNet()
        casenet.load_state_dict(torch.load(model_path))
        casenet.eval()
    
        if torch.cuda.is_available():
            casenet = torch.nn.DataParallel(casenet).cuda()
        # else:
        #     casenet = torch.nn.parallel.DistributedDataParallel(casenet)
    
        preprocess = PreprocessCT(clip_lower=-1200., clip_upper=600., spacing=True, order=1,
                                  min_max_normalize=True, scale=255, dtype='uint8')
    
        # convert the image to voxels(apply the real spacing between pixels)
        ct_array, meta = preprocess(*load_ct(ct_path))
    
        patches = patches_from_ct(ct_array, meta, config['crop_size'], nodule_list,
                                  stride=config['stride'], pad_value=config['filling_value'])
    
        results = []
    
        for nodule, (cropped_image, coords) in zip(nodule_list, patches):
>           cropped_image = Variable(torch.from_numpy(cropped_image[np.newaxis, np.newaxis]).float())
E           RuntimeError: the given numpy array has zero-sized dimensions. Zero-sized dimensions are not supported in PyTorch

src/algorithms/classify/src/gtr123_model.py:273: RuntimeError
================================ 1 failed, 5 passed in 39.14 seconds =================================

@reubano I can't reproduce it on the latest version. Can you check it again, please?

@Serhiy-Shekhovtsov did you add the new test_classify_dicom_nodule test from the included patch file?

@reubano the patch is actually invalid:

the test is running on a small DICOM image, and 72 is far outside the bounds of the Z dimension (0, 28). The cropping will therefore fail to create a patch of the expected size (96, 96, 96). The test can be fixed by simply replacing dicom_path_003 (the small image) with dicom_paths[2] (the full image).

So I have fixed that test by creating two new tests in this commit.

@Serhiy-Shekhovtsov that's right! You did mention that before. I'll edit the original issue so others don't fall into the same trap. Thanks!