Issue with image_extension when parameter use-hdf5 is used
aisosalo opened this issue · comments
There is an issue in run_producer with image_extension when use-hdf5 is added as a parameter in run.sh.
Traceback:
Traceback (most recent call last):
  File "src/heatmaps/run_producer.py", line 392, in <module>
    main()
  File "src/heatmaps/run_producer.py", line 388, in main
    produce_heatmaps(model, device, parameters)
  File "src/heatmaps/run_producer.py", line 344, in produce_heatmaps
    making_heatmap_with_large_minibatch_potential(parameters, model, exam_list, device)
  File "src/heatmaps/run_producer.py", line 270, in making_heatmap_with_large_minibatch_potential
    all_patches, all_cases = sample_patches(exam, parameters)
  File "src/heatmaps/run_producer.py", line 223, in sample_patches
    parameters=parameters,
  File "src/heatmaps/run_producer.py", line 240, in sample_patches_single
    parameters,
  File "src/heatmaps/run_producer.py", line 102, in ori_image_prepare
    image = loading.load_image(image_path, view, horizontal_flip)
  File "src/data_loading/loading.py", line 59, in load_image
    image = read_image_mat(image_path)
  File "src/utilities/reading_images.py", line 37, in read_image_mat
    data = h5py.File(file_name, 'r')
  File "env_nyukat/lib/python3.6/site-packages/h5py/_hl/files.py", line 312, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "env_nyukat/lib/python3.6/site-packages/h5py/_hl/files.py", line 142, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 78, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = 'sample_output/cropped_images/0_L_CC.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
The issue seems to go away when the file extension is hard-coded here:
def get_image_path(short_file_path, parameters):
    """
    Convert short_file_path to full file path
    """
    # Extension hard-coded to '.png' (note the leading dot)
    return os.path.join(parameters['original_image_path'], short_file_path + '.png')
The intention has probably been not to use the use-hdf5 parameter at all, but it is listed in run_producer, and it does allow the script to be modified to also save in png format (e.g. for visualization purposes) by adding here:
saving_images.save_image_as_png(img_as_ubyte(heatmap_malignant), os.path.join(
    parameters['save_heatmap_path'][0],
    short_file_path + '.png'
))
saving_images.save_image_as_png(img_as_ubyte(heatmap_benign), os.path.join(
    parameters['save_heatmap_path'][1],
    short_file_path + '.png'
))
There is a somewhat similar issue in run_model with image_extension when use-hdf5 is added as a parameter in run.sh.
Traceback:
Traceback (most recent call last):
  File "src/modeling/run_model.py", line 238, in <module>
    main()
  File "src/modeling/run_model.py", line 233, in main
    parameters=parameters,
  File "src/modeling/run_model.py", line 189, in load_run_save
    predictions = run_model(model, device, exam_list, parameters)
  File "src/modeling/run_model.py", line 82, in run_model
    horizontal_flip=datum["horizontal_flip"],
  File "src/data_loading/loading.py", line 59, in load_image
    image = read_image_mat(image_path)
  File "src/utilities/reading_images.py", line 37, in read_image_mat
    data = h5py.File(file_name, 'r')
  File "env_nyukat/lib/python3.6/site-packages/h5py/_hl/files.py", line 312, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "env_nyukat/lib/python3.6/site-packages/h5py/_hl/files.py", line 142, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 78, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = 'sample_output/cropped_images/0_L_CC.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
Perhaps the safest solution is to hard-code the correct file extension here:
loaded_image = loading.load_image(
    image_path=os.path.join(parameters["image_path"], short_file_path + ".png"),
    view=view,
    horizontal_flip=datum["horizontal_flip"],
)
Hi @aisosalo, I'm not 100% sure I understand your issue. The purpose of use-hdf5 is to read inputs that are in hdf5 format. We have not currently provided any hdf5 format sample inputs with the repository. It sounds like the functionality you're looking for is writing hdf5 formats instead? Let me know if I'm mischaracterizing your issue.
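To make the point above concrete, here is a minimal sketch of how an input path could be chosen from the use-hdf5 setting and checked before any reader is called. The helper name `resolve_image_path` and its arguments are hypothetical and not part of the repository; the `.png` and `.hdf5` extensions are taken from the snippets and tracebacks in this thread.

```python
import os


def resolve_image_path(image_dir, short_file_path, use_hdf5=False):
    """Build the full input path and fail early if the file is missing.

    Hypothetical helper for illustration; not part of the repository.
    """
    # use_hdf5 only selects which *input* format is read
    extension = ".hdf5" if use_hdf5 else ".png"
    full_path = os.path.join(image_dir, short_file_path + extension)
    if not os.path.isfile(full_path):
        raise FileNotFoundError(
            "Input image {!r} not found; check that use_hdf5={} matches "
            "the format of the files in {!r}".format(full_path, use_hdf5, image_dir)
        )
    return full_path
```

With only PNG files in the image directory, calling this with `use_hdf5=True` would raise a `FileNotFoundError` that names the mismatch, rather than failing inside h5py as in the tracebacks above.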
Thank you for your answer, it resolved my issue. I clearly misunderstood the purpose of the use-hdf5 parameter.
My purpose was to make a script to ease monitoring the heatmap generation using PyCharm:
"""
Method adapted from breast_cancer_classifier function `run_producer` by
Nan Wu, Jason Phang, Jungkyu Park, Yiqiu Shen, Zhe Huang, Masha Zorin,
Stanisław Jastrzębski, Thibault Févry, Joe Katsnelson, Eric Kim, Stacey Wolfson, Ujas Parikh,
Sushma Gaddam, Leng Leng Young Lin, Kara Ho, Joshua D. Weinstein, Beatriu Reig, Yiming Gao,
Hildegard Toth, Kristine Pysarenko, Alana Lewin, Jiyon Lee, Krystal Airola, Eralda Mema,
Stephanie Chung, Esther Hwang, Naziya Samreen, S. Gene Kim, Laura Heacock, Linda Moy,
Kyunghyun Cho, and Krzysztof J. Geras, which is licensed under a GNU Affero General Public License v3.0.
See: https://github.com/nyukat/breast_cancer_classifier/blob/master/LICENSE
"""
import sys
import os
import random
import argparse

from src.heatmaps.run_producer import produce_heatmaps
from src.heatmaps.run_producer import load_model

print(sys.version, sys.platform, sys.executable)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Generate heatmaps')
    parser.add_argument('--exam-list-path', default='sample_output/exam_list.pkl')
    parser.add_argument('--image-path', default='sample_output/cropped_images')
    parser.add_argument('--output-heatmap-path', default='sample_output/heatmaps')
    parser.add_argument('--model-path', default='models/sample_patch_model.p')
    parser.add_argument('--batch-size', default=100, type=int)
    # Boolean switch: argparse passes values as strings, so choices=[False, True]
    # would reject anything given on the command line
    parser.add_argument('--use-hdf5', action='store_true')
    parser.add_argument('--device-type', choices=['gpu', 'cpu'], default='gpu')
    parser.add_argument('--gpu-number', type=int, default=0)
    parser.add_argument('--seed', default=0, type=int)
    args = parser.parse_args()

    # Set the seed
    random.seed(args.seed)

    params = dict(
        device_type=args.device_type,
        gpu_number=args.gpu_number,
        patch_size=256,
        stride_fixed=70,
        more_patches=5,
        minibatch_size=args.batch_size,
        seed=args.seed,
        initial_parameters=args.model_path,
        input_channels=3,
        number_of_classes=4,
        data_file=args.exam_list_path,
        original_image_path=args.image_path,
        save_heatmap_path=[os.path.join(args.output_heatmap_path, 'heatmap_malignant'),
                           os.path.join(args.output_heatmap_path, 'heatmap_benign')],
        heatmap_type=[0, 1],
        use_hdf5=args.use_hdf5  # when using hdf5 format sample inputs
    )

    # Get model
    model, device = load_model(params)

    # Generate heatmaps from inputs in the chosen format
    produce_heatmaps(model, device, params)
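A side note on the --use-hdf5 option: argparse hands command-line values over as strings, so a true/false switch is usually declared with `action='store_true'`. The following is a minimal sketch with a stripped-down parser; `build_parser` is a hypothetical name used only for illustration.

```python
import argparse


def build_parser():
    """Return a parser with a single boolean switch (illustrative only)."""
    parser = argparse.ArgumentParser(description="Generate heatmaps")
    # Present on the command line -> True, absent -> False
    parser.add_argument("--use-hdf5", action="store_true")
    return parser


# Simulate the two ways the script could be invoked
flag_absent = build_parser().parse_args([]).use_hdf5
flag_present = build_parser().parse_args(["--use-hdf5"]).use_hdf5
```

Here `flag_absent` is `False` and `flag_present` is `True`, so the value can be passed on as `use_hdf5` without further conversion.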
What might be the possible benefits of using hdf5 format mammogram images as an input to the network? Is it something to consider when fine-tuning the pre-trained models for a different dataset?
It shouldn't matter which format you use as long as you correctly replace the part of the code which is loading the images. We chose that format simply because it was the fastest to load when we tested it on our cluster.
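Since loading speed depends on the storage and hardware in use, it may be worth measuring it locally before settling on a format. Below is a minimal sketch of such a comparison using only the standard library. The names `time_loader` and `compare_loaders` are hypothetical, and the loader callables are assumptions: any function that takes a file path and returns an image would do.

```python
import time


def time_loader(load_fn, paths, repeats=3):
    """Return the best wall-clock time in seconds to load every path once."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for path in paths:
            load_fn(path)
        best = min(best, time.perf_counter() - start)
    return best


def compare_loaders(loaders, paths_by_format, repeats=3):
    """Time each named loader on its own list of files.

    loaders: dict mapping a format name to a callable taking one path.
    paths_by_format: dict mapping the same names to lists of file paths.
    """
    return {
        name: time_loader(load_fn, paths_by_format[name], repeats)
        for name, load_fn in loaders.items()
    }
```

For example, `loaders` could map `"hdf5"` to `read_image_mat` (the reader that appears in the tracebacks above) and `"png"` to whichever PNG reader is in use. Because the best of several repeats is reported, the numbers mostly reflect reads served from the operating system's file cache; a single cold run may look quite different.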