How to set output tensor type as Image Type?

Question

dragen1860 opened this issue 2 years ago · comments

Dear author:
I found that the Super Resolution project sets both the input and the output type to the image type in Core ML, which makes it very easy to use the preview function in Xcode. However, I cannot find any way to set the output multiarray tensor to the image type. The online resources only explain how to set the input as an image type; none of them explain how to set the output. Thank you.

MLBoy_DaisukeMajima · Answer 1 · Fri Jan 07 2022 07:29:06 GMT+0800 (China Standard Time)

Hi dragen1860.

You can turn the output into an image like this:

import coremltools as ct
from coremltools.proto import FeatureTypes_pb2 as ft

# Load the converted model and wrap its spec in a builder so it can be edited.
mlmodel = ct.models.MLModel("realesrgan.mlmodel")
spec = mlmodel.get_spec()
builder = ct.models.neural_network.NeuralNetworkBuilder(spec=spec)

# Describe the first output as an RGB image instead of a multi-array.
output = builder.spec.description.output[0]
output.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('RGB')
output.type.imageType.width = 1280
output.type.imageType.height = 1280

In the Real ESRGAN case, you also need to add an activation layer, because this model normalizes its values to -1~1.
So, before you set the image output, you have to add an activation layer that computes (value + 1) * 127.5.

# Squeeze singleton dimensions, then apply 127.5 * x + 127.5 to map -1~1 to 0~255.
builder.add_squeeze(name="squeeze", input_name="var_4053", output_name="squeeze_out", axes=None, squeeze_all=True)
builder.add_activation(name="activation", non_linearity="LINEAR", input_name="squeeze_out", output_name="activation_out", params=[127.5, 127.5])
# Replace the old output description with one that points at the new layer.
builder.spec.description.output.pop()
builder.spec.description.output.add()
output = builder.spec.description.output[0]
output.name = "activation_out"
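As a quick sanity check of the arithmetic, the LINEAR activation with params=[127.5, 127.5] computes alpha * x + beta, which is the same as (x + 1) * 127.5. The sketch below reproduces that mapping in plain Python, without coremltools. The function name linear_activation is hypothetical and used only for illustration.

```python
def linear_activation(x, alpha=127.5, beta=127.5):
    """Mimic a LINEAR activation: alpha * x + beta (illustrative helper only)."""
    return alpha * x + beta

# The ends and the middle of the -1~1 range land on 0, 127.5 and 255.
darkest = linear_activation(-1.0)
middle = linear_activation(0.0)
brightest = linear_activation(1.0)
print(darkest, middle, brightest)
```

If the values coming out of the network are not actually in -1~1, this mapping will not cover 0~255, so it is worth checking the raw output range of your own model first.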

If you need more help, ask me.

MLBoy.

Hendrik Holtmann · Answer 2 · Sat Oct 08 2022 06:32:24 GMT+0800 (China Standard Time)

I'm just getting into converting models to Core ML, and I also have a question about my custom conversion script for Real ESRGAN. I understand that the output needs an activation layer, and your example was a great help. However, I am unsure about the image input. Does it also need its values normalised to the range -1 to 1? I tried a scale of 1/127.5 with bias values of -1, but it gives wrong results. I also tried a scale of 1/255 with a bias of 0, but the image looks washed out. Any hints?
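
For anyone debugging the same thing, the two input settings above can be compared numerically. The sketch below is plain Python and only shows the arithmetic; it does not say which range a particular Real ESRGAN checkpoint expects, since the thread does not settle that. The helper names preprocess and to_pixels are hypothetical, and the last step rests on an assumption (network output in 0~1) that is labelled in the code.

```python
import math

def preprocess(pixel, scale, bias):
    """Image input preprocessing for one channel value: scale * pixel + bias."""
    return scale * pixel + bias

def to_pixels(value):
    """The output activation from Answer 1: 127.5 * value + 127.5."""
    return 127.5 * value + 127.5

# Setting A: scale 1/127.5, bias -1 maps 0..255 to roughly -1..1.
a_low = preprocess(0, 1 / 127.5, -1.0)
a_high = preprocess(255, 1 / 127.5, -1.0)

# Setting B: scale 1/255, bias 0 maps 0..255 to roughly 0..1.
b_low = preprocess(0, 1 / 255, 0.0)
b_high = preprocess(255, 1 / 255, 0.0)

# ASSUMPTION (not confirmed in the thread): if the network's output were in
# 0..1 rather than -1..1, the output activation would squeeze it into the
# upper half of the pixel range, which would look pale.
washed_low = to_pixels(0.0)
washed_high = to_pixels(1.0)

print(a_low, a_high, b_low, b_high, washed_low, washed_high)
```

The point of the comparison is that the input scaling and the output activation have to agree on the same value range. A mismatch between them is one possible, unverified explanation for a washed-out result.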