Feature Request: Support custom and non-square input sizes
j99ca opened this issue
According to the docs, the input sizes supported by OTX are a fixed list of square sizes. With most convolutional model architectures, it should be possible to use non-square input sizes while still reusing pre-trained weights, by placing a global pooling layer at the head of the model. Some classification models in TensorHub support this, and it would be a great feature for OTX classification that would accelerate our adoption of this library at the edge. I have use cases with very tall images from certain sensors, where resizing them to any of the fixed square sizes skews the aspect ratio and can destroy the features needed for classification.
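For context, a minimal sketch of the idea in plain PyTorch (not OTX code; the layer sizes are arbitrary): a convolutional backbone followed by global average pooling accepts any height and width, so pretrained conv weights can be reused with non-square inputs.

import torch
from torch import nn

# Stand-in for a pretrained convolutional backbone; conv weights do not depend on H x W.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
)
# Global average pooling removes the spatial-size dependency before the classifier.
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 10),  # 10 classes, for illustration
)

tall_image = torch.randn(1, 3, 1024, 256)  # very tall, non-square input
logits = head(backbone(tall_image))        # no resizing to a square needed
print(logits.shape)                        # torch.Size([1, 10])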
@eunwoosh Let's consider non-square input size.
@goodsong81 @eunwoosh do you folks have a timeline for custom input sizes (with or without non-square support) in this library? I am trying to schedule some integration work around OTX 2.x, and the lack of this feature is blocking.
Keep up the good work!
Not yet confirmed, but I expect it will be enabled in the next quarter (Q3) of this year.
@goodsong81 I see that this PR got merged: #3759
Could that input_size parameter be used instead of the fixed values in the model scripts? E.g. in MobileNetV3Base:
class MobileNetV3Base(ModelInterface):
    """Base model of MobileNetV3."""

    def __init__(
        self,
        num_classes: int = 1000,
        width_mult: float = 1.0,
        in_channels: int = 3,
        input_size: tuple[int, int] = (224, 224),
        dropout_cls: nn.Module | None = None,
        pooling_type: str = "avg",
        feature_dim: int = 1280,
        instance_norm_first: bool = False,
        self_challenging_cfg: bool = False,
        **kwargs,
    ):
as well as in the associated export code? E.g. in MobileNetV3ForMulticlassCls:
@property
def _exporter(self) -> OTXModelExporter:
    """Creates OTXModelExporter object that can export the model."""
    return OTXNativeModelExporter(
        task_level_export_parameters=self._export_parameters,
        input_size=(1, 3, 224, 224),
        mean=(123.675, 116.28, 103.53),
        std=(58.395, 57.12, 57.375),
        resize_mode="standard",
        pad_value=0,
        swap_rgb=False,
        via_onnx=False,
        onnx_export_configuration=None,
        output_names=["logits", "feature_vector", "saliency_map"] if self.explain_mode else None,
    )
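For illustration, a hypothetical version of that exporter reading the size from the model instance instead of hardcoding 224 x 224 (this assumes a self.input_size attribute holding (height, width); it is not current OTX code):

@property
def _exporter(self) -> OTXModelExporter:
    """Creates OTXModelExporter object that can export the model."""
    height, width = self.input_size  # assumption: the model stores the size it was built with
    return OTXNativeModelExporter(
        task_level_export_parameters=self._export_parameters,
        input_size=(1, 3, height, width),  # derived from the model instead of fixed
        mean=(123.675, 116.28, 103.53),
        std=(58.395, 57.12, 57.375),
        resize_mode="standard",
        pad_value=0,
        swap_rgb=False,
        via_onnx=False,
        onnx_export_configuration=None,
        output_names=["logits", "feature_vector", "saliency_map"] if self.explain_mode else None,
    )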
Hi @j99ca, #3759 is a preparation step for configurable input size. That PR just enables transforms in recipes to use $(input_size). I'm now implementing configurable input size on top of #3759.
Currently, there is no input size configuration interface that updates both the model and the dataset, so if you want to do that now, you need to change the model class code, i.e. the init argument and the exporter part, as you said.
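For example, a rough sketch of the kind of manual change meant here (hypothetical subclass name and size; not a supported configuration interface), based on the MobileNetV3Base signature quoted above:

CUSTOM_INPUT_SIZE = (1024, 256)  # hypothetical (height, width) for tall sensor images

class CustomSizeMobileNetV3(MobileNetV3Base):
    """MobileNetV3 variant constructed with a non-square input size."""

    def __init__(self, **kwargs):
        kwargs.setdefault("input_size", CUSTOM_INPUT_SIZE)  # init-argument change
        super().__init__(**kwargs)

The exporter's input_size=(1, 3, 224, 224) and the resize transforms in the recipe would then need matching values so that training, preprocessing, and export stay consistent.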
#3788 is merged. OTX supports non-square input sizes now.