junyongyou / triq

TRIQ implementation

About koniq-10k dataset

CharlesWu123 opened this issue

Hello, I can't open http://database.mmsp-kn.de/koniq-10k-database.html to download the koniq-10k dataset. Is there another way to download the dataset? Thanks.

Hi, I have no problem downloading the images from the website. Maybe you can try again later, or use a proxy.

Hello, I have a question. maximum_position_encoding is still too small in a real application scenario: the input image may be as large as 3000x3000, and I can't set maximum_position_encoding to the value corresponding to the largest image. Instead, I could use RoI pooling in place of max pooling to convert input images of different sizes into feature maps of the same size, so that they all share the same maximum_position_encoding. That way the model could adapt to input images of various sizes. I don't know if this idea is feasible. Looking forward to your reply.

Yes, you can try that. Alternatively, you can use a larger pooling size in place of the one at line 146 in transformer_iqa.py.
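To make the fixed-size idea concrete, here is a minimal pure-Python sketch of adaptive max pooling, which maps a feature grid of any size onto a fixed number of bins. It is not the code in transformer_iqa.py: the function name `adaptive_max_pool_2d` is hypothetical, the grid is a plain list of rows standing in for one channel of a feature map, and a real model would apply the same bin rule to tensors.

```python
def adaptive_max_pool_2d(feature_map, out_h, out_w):
    """Max-pool a 2-D grid (list of rows) into exactly out_h x out_w bins.

    Hypothetical helper for illustration only; not part of the TRIQ code base.
    """
    in_h, in_w = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(out_h):
        # Bin i covers rows floor(i * in_h / out_h) .. ceil((i + 1) * in_h / out_h) - 1.
        r0 = (i * in_h) // out_h
        r1 = ((i + 1) * in_h + out_h - 1) // out_h
        row = []
        for j in range(out_w):
            # Same rule along the width.
            c0 = (j * in_w) // out_w
            c1 = ((j + 1) * in_w + out_w - 1) // out_w
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Two grids of different sizes end up with the same 2 x 3 shape.
small = [[r * 6 + c for c in range(6)] for r in range(4)]   # 4 x 6
large = [[r * 7 + c for c in range(7)] for r in range(5)]   # 5 x 7
pooled_small = adaptive_max_pool_2d(small, 2, 3)
pooled_large = adaptive_max_pool_2d(large, 2, 3)
```

Because the output grid is always out_h x out_w, the flattened sequence has out_h * out_w positions regardless of the input resolution, so maximum_position_encoding could be fixed at that count (plus any extra tokens the model prepends, if it does so). The trade-off is that very large images are summarized more coarsely per bin.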