How do I train and test on a custom dataset?

Question

How do I train and test on a custom dataset?

yoho131 opened this issue 8 months ago · comments

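First of all, thank you for open-sourcing your project code.
How can I run the code on a dataset I built myself?
When I tried, I got the error below.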

yoho131 · Answer 1 · Wed Jul 17 2024 19:47:22 GMT+0800 (China Standard Time)

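Sorry, after reading the code I managed to get it running. However, the images in my dataset are 5120×5120, and the results are very poor if I resize them to 240×240.
So I set --img-cropsize and --img-resize to 1024 in train_seg.py and test_seg.py, and running it then fails with
RuntimeError: The size of tensor a (4097) must match the size of tensor b (226) at non-singleton dimension 1
The error is raised at the following location.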

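Here x.shape is (1, 4097, 896) and self.positional_embedding.shape is (226, 896).
After studying the code again, I found that the 226 (226 = (240/16)^2 + 1) comes from the image_size and patch_size set here. If image_size could be changed to 1024 with patch_size left unchanged, self.positional_embedding.shape would become (4097, 896), since (1024/16)^2 + 1 = 4097.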
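The arithmetic above can be checked with a short sketch. It assumes a standard ViT layout: one token per non-overlapping square patch plus a single class token. The helper name num_tokens is made up for illustration and is not part of the PromptAD code.

```python
def num_tokens(image_size: int, patch_size: int) -> int:
    """Token count of a ViT-style encoder: one per patch plus one class token.

    Hypothetical helper for illustration; assumes square images and patches.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    grid = image_size // patch_size  # patches along each side
    return grid * grid + 1           # +1 for the class token


print(num_tokens(240, 16))   # 226
print(num_tokens(1024, 16))  # 4097
```

The two printed values match the two sizes in the RuntimeError: 4097 tokens come from the 1024 input, while the positional embedding still has the 226 rows built for a 240 input.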

So where should I change image_size? Changing it to 1024 here has no effect.

FuNz · Answer 2 · Fri Jul 19 2024 12:27:02 GMT+0800 (China Standard Time)

You need to upsample the position embedding. The ViT source code should already include an implementation of this.
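A minimal sketch of what such upsampling could look like in PyTorch follows. It is not the PromptAD or ViT implementation. It assumes the embedding has shape (1 + H*W, C) with the class token in the first row and a square patch grid, and it uses the shapes from this thread (226 rows, width 896, target grid 64×64). The function name interpolate_pos_embed_sketch and the choice of bicubic interpolation are assumptions.

```python
from typing import Tuple

import torch
import torch.nn.functional as F


def interpolate_pos_embed_sketch(pos_embed: torch.Tensor,
                                 new_grid: Tuple[int, int]) -> torch.Tensor:
    """Resample a (1 + H*W, C) positional embedding to a new patch grid.

    Hypothetical helper: assumes row 0 is the class token and the
    remaining rows form a square H x W grid in row-major order.
    """
    cls_tok, grid_tok = pos_embed[:1], pos_embed[1:]
    old = round(grid_tok.shape[0] ** 0.5)
    if old * old != grid_tok.shape[0]:
        raise ValueError("expected a square patch grid")
    dim = grid_tok.shape[1]
    # (H*W, C) -> (1, C, H, W) so F.interpolate can treat it like an image
    grid_tok = grid_tok.reshape(1, old, old, dim).permute(0, 3, 1, 2)
    grid_tok = F.interpolate(grid_tok, size=new_grid, mode="bicubic",
                             align_corners=False)
    # (1, C, H', W') -> (H'*W', C)
    grid_tok = grid_tok.permute(0, 2, 3, 1).reshape(new_grid[0] * new_grid[1], dim)
    # The class token is kept unchanged and placed back in front
    return torch.cat([cls_tok, grid_tok], dim=0)


pos = torch.randn(226, 896)                     # stand-in for the pretrained embedding
resized = interpolate_pos_embed_sketch(pos, (64, 64))
print(resized.shape)                            # torch.Size([4097, 896])
```

In this sketch the pretrained values are resampled rather than re-initialised, so the 64×64 grid stays a smooth stretch of the original 15×15 grid. How well the checkpoint performs at the new resolution would still have to be verified on the data.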


yoho131 · Answer 3 · Fri Jul 19 2024 17:24:37 GMT+0800 (China Standard Time)

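If I simply change self.grid_size to (64, 64) at this point in PromptAD/CLIPAD/transformer.py, will that affect detection results? After the change the code runs, but I don't know whether it hurts the performance of the checkpoint.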