hkchengrex / XMem

Thank you for your hard work! As I understand it, because of the working memory (high-resolution feature memory), the XMem model supports input images of any resolution, such as DAVIS's 480x854 or other resolutions. Is that correct? Thanks.

You can change the size (number of pixels of the shorter side) with an input argument:

XMem/eval.py

Line 63 in 4589acc

parser.add_argument('--size', default=480, type=int,
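To illustrate how such an argument behaves, here is a minimal, self-contained sketch using Python's standard `argparse`. Only the `--size` flag, its default of 480, and its `int` type come from the line quoted above; the help text and the `build_parser` helper are assumptions made for illustration, not the contents of `eval.py`.

```python
import argparse


def build_parser():
    """Build a parser with a --size option like the one quoted above."""
    parser = argparse.ArgumentParser()
    # The help string is illustrative; the original line is truncated.
    parser.add_argument('--size', default=480, type=int,
                        help='number of pixels of the shorter side')
    return parser


parser = build_parser()

# With no flag given, the default applies.
default_args = parser.parse_args([])

# Passing the flag overrides the default without editing the source.
custom_args = parser.parse_args(['--size', '1080'])

print(default_args.size, custom_args.size)
```

With a parser like this, the value can be supplied when the script is invoked, so the `default=` in the source does not have to be changed.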

For example, my dataset's resolution is 1920x1080 pixels, so should I set default=1080? I think I've got it, thank you!

You can also set it to "-1" to leave the original resolution untouched. It might use a lot of memory, be slow, or not perform well at such a high resolution though, because the model is not trained to do so.
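As a rough sketch of what a shorter-side size setting implies, the function below computes the resulting frame shape. It assumes the shorter side is scaled to `size` with the aspect ratio preserved, and that a negative value leaves the frame untouched. The name `resized_shape` and the rounding rule are hypothetical; the actual resize code in the project may round differently.

```python
def resized_shape(height, width, size):
    """Return (height, width) after scaling the shorter side to `size`.

    A negative `size` (e.g. -1) keeps the original resolution.
    This is an illustrative assumption, not the project's own code.
    """
    if size < 0:
        return height, width
    scale = size / min(height, width)
    return round(height * scale), round(width * scale)


# A 1920x1080 frame evaluated with the default size of 480.
print(resized_shape(1080, 1920, 480))

# The same frame with size=1080 or size=-1 stays at full resolution.
print(resized_shape(1080, 1920, 1080))
print(resized_shape(1080, 1920, -1))
```

Under these assumptions, a 1080p frame at the default setting is reduced to roughly DAVIS-like dimensions, while 1080 or -1 keeps every pixel.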

Do I understand this correctly? If I train on a 1080p dataset, I think I only need to adjust the dataset paths used during training and then use the model weights saved after training for prediction. If I then set default=1080, should there be no problems with slow speed, high memory usage, or poor segmentation accuracy? I'm not sure I understand this correctly; I'm still studying your paper carefully. Maybe the question is a bit basic. Thanks again, you have helped a lot.

You would also need to change the crop size. Do a global search for "384" in the project.
It might help with the worse segmentation accuracy, but not the slow speed/high memory usage.
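A back-of-the-envelope sketch of why speed and memory do not improve with retraining: the cost depends on how many feature locations each frame produces. The code below assumes features at a stride of 16 and a dense affinity between every memory location and every query location. Both are simplifying assumptions for illustration, the function names are hypothetical, and the model's actual memory management may reduce these numbers.

```python
import math


def tokens_per_frame(height, width, stride=16):
    """Number of feature locations for one frame at the assumed stride."""
    return math.ceil(height / stride) * math.ceil(width / stride)


def affinity_entries(height, width, memory_frames, stride=16):
    """Entries in a dense affinity matrix between memory and query tokens."""
    n = tokens_per_frame(height, width, stride)
    return memory_frames * n * n


low = tokens_per_frame(480, 854)     # DAVIS-like resolution
high = tokens_per_frame(1080, 1920)  # 1080p

print(low, high, high / low)
print(affinity_entries(1080, 1920, 1) / affinity_entries(480, 854, 1))
```

Under these assumptions a 1080p frame has about 5 times as many feature locations as a 480x854 frame, and a dense affinity against the same number of memory frames has about 25 times as many entries, regardless of what data the weights were trained on.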

Regarding the 384x384 patches, I have seen them in the STM papers before. Do you mean I should make the patches somewhat larger? If so, how large is appropriate? Patches of size 384 are too small, so one image gets divided into many patches, which leads to high memory usage, right? Thank you for your continued replies!

For example, do all of these results from my global search for 384 need to be adjusted?

I don't know what patch sizes are good. This is just a general recommendation. Surely you can just try the current model first.

Thank you very, very much! Your replies have been extremely helpful! I will keep experimenting on my own!

Some questions about input image resolution