juglab / n2v

This is the implementation of Noise2Void training.

A question about training with RGB images

Vandermode opened this issue · comments

Hi, nice work. I looked at the 2D example and found that you augment data with shape N x H x W x 1 by a mask, so that the final input size is N x H x W x 2. When this is applied to RGB images, does that mean we should augment the input to N x H x W x 6? Thank you in advance.

Hi @Vandermode
Apologies for the delayed answer.

The input (X) should just be the normal image, i.e. N x H x W x 1 for grayscale and N x H x W x 3 for RGB images. We decided to attach the mask to the target (Y), since the mask is only used during training. In the case of an RGB image, three additional mask channels are needed, because every channel gets its own mask:

self.Y_Batches[(j, *coords[k], self.n_chan+c)] = 1

This is possible because the noise is independent across channels.
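To make the shapes concrete, here is a minimal sketch (not the n2v API; `make_target` and its signature are illustrative) of a target tensor built this way, with the first C channels holding pixel values and the last C channels holding the 0/1 masks:

```python
import numpy as np

def make_target(batch, coords, values):
    """Build a Noise2Void-style target with per-channel masks (sketch).

    batch:  (N, H, W, C) array of target pixel values.
    coords: list of (n, y, x) masked-pixel coordinates.
    values: (len(coords), C) original pixel values at those coordinates.
    Returns a (N, H, W, 2*C) target: first C channels are values,
    last C channels are 0/1 masks marking where the loss is computed.
    """
    n, h, w, c = batch.shape
    y = np.zeros((n, h, w, 2 * c), dtype=batch.dtype)
    y[..., :c] = batch
    for k, (bi, yi, xi) in enumerate(coords):
        for ch in range(c):
            y[bi, yi, xi, ch] = values[k][ch]  # store the original value
            y[bi, yi, xi, c + ch] = 1          # flag: compute loss here
    return y

# grayscale: X is (N, H, W, 1) -> Y is (N, H, W, 2)
# RGB:       X is (N, H, W, 3) -> Y is (N, H, W, 6)
```

So for RGB the six target channels are three value channels plus three mask channels, matching the indexing `self.n_chan + c` above.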

It just occurred to me that our current implementation is not optimal. We mask the same pixel in all channels:

for k in range(len(coords)):
    for c in range(self.n_chan):
        self.Y_Batches[(j, *coords[k], c)] = y_val[k][c]
        self.Y_Batches[(j, *coords[k], self.n_chan+c)] = 1
        self.X_Batches[(j, *coords[k], c)] = x_val[k][c]

This is not required. I will investigate and fix this issue.
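For illustration, one way independent per-channel masking could look is sketched below. This is a hypothetical rewrite, not code from the n2v repository: `mask_per_channel`, its parameters, and the neighbor-replacement strategy are all assumptions, shown only to make the idea concrete. Each channel draws its own masked-pixel coordinates, which is valid because the noise is independent across channels.

```python
import numpy as np

def mask_per_channel(x_batches, y_batches, n_chan, rng, n_pix=8):
    """Sketch: mask a separate random pixel set in each channel.

    x_batches: (N, H, W, C) network inputs, modified in place.
    y_batches: (N, H, W, 2*C) targets (values + masks), modified in place.
    """
    n, h, w, _ = x_batches.shape
    for j in range(n):
        for c in range(n_chan):
            # independent coordinates for each channel
            ys = rng.integers(0, h, size=n_pix)
            xs = rng.integers(0, w, size=n_pix)
            for yi, xi in zip(ys, xs):
                # record the true value and raise the mask flag
                y_batches[j, yi, xi, c] = x_batches[j, yi, xi, c]
                y_batches[j, yi, xi, n_chan + c] = 1
                # replace the input pixel with a nearby neighbor's value
                ny = int(np.clip(yi + rng.integers(-2, 3), 0, h - 1))
                nx = int(np.clip(xi + rng.integers(-2, 3), 0, w - 1))
                x_batches[j, yi, xi, c] = x_batches[j, ny, nx, c]
```

Compared to the loop above, the coordinates are drawn inside the channel loop, so a pixel masked in the red channel stays untouched in green and blue.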