google / uncertainty-baselines

High-quality implementations of standard and SOTA methods on a variety of tasks.

[Diabetic Retinopathy] I cannot find normalisation nor data augmentation code

tjiagoM opened this issue · comments

Hello,

I'm trying to use and understand your code for the diabetic retinopathy use case. I'm quite confused about your preprocessing steps, because you don't seem to scale the input from [0, 255] to [0, 1]. For instance, this line of your preprocessing steps says "A pre-process function to return images in [0, 1]". However, you don't seem to be doing that there or anywhere else. You also no longer seem to do any data augmentation during training; is that correct?

Can you please help me clarify this? In your previous repository it was clear that you were normalising the inputs to [0, 1] and doing data augmentation, so I'm confused about why this is no longer the case.

Thanks!

Hello, and thank you for your interest in the baselines!

The comment you link to does seem to be outdated, as preprocessing is already done when the dataset is constructed, e.g., with

python3
>>> import uncertainty_baselines as ub
>>> dr_builder = ub.datasets.get(
...     'diabetic_retinopathy_detection', split='train', data_dir='/home/data')
>>> dr_builder._dataset_builder.download_and_prepare()

We will be releasing significantly overhauled Diabetic Retinopathy code shortly and will change this comment.

We now use the btgraham-300 processing from the 2015 Kaggle competition winner, which subtracts the local average colour from the images, so that each local region of the preprocessed image is roughly zero-mean.
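Concretely, the idea is along these lines. This is a NumPy sketch with a box filter standing in for the Gaussian blur used in practice; the 4x/128 constants follow Ben Graham's published recipe and are not necessarily the exact values in the tfds code:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_average(img, radius=5):
    # Box-filter stand-in for the Gaussian blur used in practice:
    # average over a (2r+1)x(2r+1) window around each pixel, with
    # edge padding so the output matches the input shape.
    k = 2 * radius + 1
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)),
                    mode='edge')
    windows = sliding_window_view(padded, (k, k), axis=(0, 1))
    return windows.mean(axis=(-2, -1))

def subtract_local_average(img):
    # Ben Graham's recipe: amplify the difference from the local
    # average, then shift by 128 to keep values displayable.
    img = img.astype(np.float32)
    return 4.0 * (img - local_average(img)) + 128.0

# A constant image equals its own local average, so the output
# collapses to the 128 offset everywhere.
out = subtract_local_average(np.full((32, 32, 3), 200, dtype=np.uint8))
```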

You are correct that we are not currently using data augmentation during training; we hope to add options for this soon.
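For reference, a typical starting point would be random flips plus mild photometric jitter. Here is a plain-NumPy sketch; the parameters are purely illustrative, not what the repository will eventually ship:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image):
    # Random horizontal/vertical flips plus mild brightness jitter --
    # illustrative settings only, not the repo's eventual pipeline.
    if rng.random() < 0.5:
        image = image[:, ::-1]
    if rng.random() < 0.5:
        image = image[::-1, :]
    delta = rng.uniform(-0.1, 0.1)
    return np.clip(image + delta, 0.0, 1.0)

# Augmenting a [0, 1] image preserves its shape and value range.
augmented = augment(np.full((8, 8, 3), 0.5))
```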

Hi @nband and thank you so much for your answer.

I checked the code you pointed to in the tensorflow_datasets library, but I don't think what you refer to as _subtract_local_average() does what you are saying. From what I see, it just adds a Gaussian-blurred image on top.

If I run the following code, I believe what I say is correct. Notice how the corresponding tensor has values up to 255, instead of being in [0, 1]:

import tensorflow as tf
import tensorflow_datasets as tfds

def preprocess_img(x):
    # Cast to float32 (no rescaling) and pad-resize to 512x512.
    input_x, input_y = x['image'], x['label']
    input_x = tf.cast(input_x, tf.float32)
    input_x = tf.image.resize_with_pad(input_x, 512, 512, method='bilinear')
    return input_x, input_y

builder = tfds.builder(name='diabetic_retinopathy_detection/btgraham-300')
builder.download_and_prepare()
ds_val = builder.as_dataset(split='validation', shuffle_files=False)
ds_val = ds_val.map(preprocess_img)
ds_val = ds_val.batch(50)

for batch_img, labels in ds_val:
    print(batch_img.shape, '\n',
          tf.math.reduce_max(batch_img), '\n',
          tf.unique(tf.reshape(batch_img, [-1])))
    break

The output I get is:

(50, 512, 512, 3) 
 tf.Tensor(255.0, shape=(), dtype=float32) 
 Unique(y=<tf.Tensor: shape=(7520137,), dtype=float32, numpy=
array([  0.     , 128.     , 128.92969, ...,  77.83664,  92.72637,
       111.37043], dtype=float32)>, idx=<tf.Tensor: shape=(39321600,), dtype=int32, numpy=array([0, 0, 0, ..., 0, 0, 0], dtype=int32)>)

@nband I found the issue!! Your function uses tf.image.convert_image_dtype(), which automatically scales to [0, 1], whereas my tf.cast() doesn't (indeed, I was dividing by 255 myself before, and missed this detail in your code).
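For anyone else hitting this, the difference boils down to whether the uint8 values get divided by 255. Mimicking the two TensorFlow calls in NumPy:

```python
import numpy as np

img_uint8 = np.array([0, 128, 255], dtype=np.uint8)

# tf.cast(x, tf.float32) just reinterprets the values: range stays 0-255.
cast_like = img_uint8.astype(np.float32)

# tf.image.convert_image_dtype(x, tf.float32) divides integer inputs by
# the dtype's maximum (255 for uint8), so the result lands in [0, 1].
converted_like = img_uint8.astype(np.float32) / 255.0
```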

Thanks for the attention and sorry for the confusion. I look forward to the data augmentation code!