Working of ladi-vton with the DressCode dataset

Question

kira1433 opened this issue a year ago · comments

The model trained on VITON-HD is amazing; it is definitely one of the best.
Is it just me, or are the results generated with the DressCode model nowhere near those of the VITON-HD model?
It is understandable that the model cannot reproduce textual details on t-shirts.
But I wrote a pre-processing pipeline in Colab and ran inference on custom data, and even a small error in the parsing produces a completely bad image. My pre-processing outputs are now correct, with the right image size and positioning, but the DressCode model simply does not work with faces. All the faces in the final results are distorted, as in the attached image. Aren't the EMASC modules supposed to restore the face?

Is there anything that can be done to solve this issue, or is it not possible?

Rishi · Answer 1 · Sat Jul 15 2023 11:58:45 GMT+0800 (China Standard Time)

@kira1433 could you share your pre-processing pipeline Colab for VITON-HD? I am having issues with the pre-processing for image-parse-agnostic-v3.2 on custom data and have been stuck on this for a while.

K Aashish Chandra · Answer 2 · Sat Jul 15 2023 16:27:40 GMT+0800 (China Standard Time)

Sorry, but I only made it for the DressCode model, and it currently needs a few things in my Google Drive to work.
You can check out https://github.com/sangyun884/HR-VITON/blob/main/Preprocessing.md and sangyun884/HR-VITON#45 for something that can work for the VITON-HD dataset.

Rishi · Answer 3 · Sat Jul 15 2023 16:37:44 GMT+0800 (China Standard Time)

No problem. I already followed the steps in both guides, but somehow agnostic-v3.2 is giving bad output.

K Aashish Chandra · Answer 4 · Sat Jul 15 2023 16:54:41 GMT+0800 (China Standard Time)

@RishiGitH can you send all the parsings you have generated? The get_agnostic.py on their GitHub is what they used to generate the agnostic images for the dataset as well, so I don't think that's the issue.

Rishi · Answer 5 · Sat Jul 15 2023 16:59:36 GMT+0800 (China Standard Time)

test_4.zip
Here are all the parsings I have generated.

K Aashish Chandra · Answer 6 · Sat Jul 15 2023 17:18:40 GMT+0800 (China Standard Time)

I think the problem is with the image you are using: the hands are not visible at all, but the code requires them to be. I think it's an edge case, so you shouldn't worry about it.
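One way to catch this case before running the agnostic step is to check whether the arm joints were detected at all. The following is a minimal sketch, not code from either repository: it assumes the pose file follows the OpenPose layout (a flat `[x, y, confidence, ...]` list per person, with joints 2 to 7 being the shoulders, elbows and wrists), and the function name `missing_arm_joints` and the `min_conf` threshold are made up for illustration.

```python
# Arm joints in the OpenPose BODY_25 / COCO-18 ordering (assumed layout).
ARM_JOINTS = {
    2: "right shoulder", 3: "right elbow", 4: "right wrist",
    5: "left shoulder", 6: "left elbow", 7: "left wrist",
}


def missing_arm_joints(pose_keypoints_2d, min_conf=0.05):
    """Return the names of arm joints that were not detected.

    pose_keypoints_2d is the flat [x0, y0, c0, x1, y1, c1, ...] list
    stored for one person. A joint counts as missing when it sits at
    (0, 0) or its confidence is below min_conf (hypothetical threshold).
    """
    missing = []
    for idx, name in ARM_JOINTS.items():
        base = 3 * idx
        if base + 2 >= len(pose_keypoints_2d):
            # The list is too short to contain this joint at all.
            missing.append(name)
            continue
        x, y, conf = pose_keypoints_2d[base:base + 3]
        if (x == 0 and y == 0) or conf < min_conf:
            missing.append(name)
    return missing
```

With an OpenPose-style JSON file, the input would typically be `json.load(f)["people"][0]["pose_keypoints_2d"]`. A non-empty result suggests the photo may hit the edge case described above, so it could be worth trying a different image first.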

Rishi · Answer 7 · Sat Jul 15 2023 17:47:34 GMT+0800 (China Standard Time)

@kira1433 thanks a lot for the insight, I was stuck on this. Would you recommend using a new photo altogether, or can we make this one work somehow?

Alberto Baldrati · Answer 8 · Sat Jul 15 2023 17:58:50 GMT+0800 (China Standard Time)

Hi @kira1433,
Thanks for your interest in our work.

We've reviewed the image you provided, and it seems that there is an issue. However, without knowing the specific preprocessing pipeline you used, it's difficult to pinpoint the problem.

Regarding the EMASC Module, you are correct in noting that its purpose is to restore the regions of the image outside the inpainting area.

Additionally, it's important to note that our trained models have been tested on the DressCode and VITON-HD datasets. While they perform well on these datasets, their performance on custom data, especially outside these datasets, may vary.

Alberto
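Since the regions outside the inpainting area are expected to be preserved, one rough diagnostic is to measure how much the output differs from the input there. This is only a sketch under assumptions: it expects the input image, the generated image and a boolean inpainting mask as same-sized arrays, and the helper name `change_outside_mask` is hypothetical and not part of ladi-vton.

```python
import numpy as np


def change_outside_mask(original, generated, inpaint_mask):
    """Mean absolute pixel difference where inpaint_mask is False.

    original, generated: H x W x C arrays with the same shape.
    inpaint_mask: H x W boolean array, True inside the inpainting area.
    """
    original = np.asarray(original, dtype=np.float32)
    generated = np.asarray(generated, dtype=np.float32)
    keep = ~np.asarray(inpaint_mask, dtype=bool)
    if not keep.any():
        # Everything was inpainted, so there is nothing to compare.
        return 0.0
    diff = np.abs(original - generated)
    # Boolean indexing selects the pixels outside the inpainting area.
    return float(diff[keep].mean())
```

A value close to zero means the area outside the mask was kept intact. A large value over the face suggests either that the face fell inside the mask during pre-processing or that it was altered on the way through the model.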

K Aashish Chandra · Answer 9 · Sat Jul 15 2023 20:31:48 GMT+0800 (China Standard Time)

Thank you for the reply. Here is the notebook I used. It probably won't work now, since I changed some of the tools I used and it also needs my Drive. I'll send you the notebook and the parsings for some of the pictures I used instead: https://drive.google.com/file/d/1WTQj4RDKlJWqFVuduoz06B7kmzWErF0v/view?usp=sharing
Below are the parsings. I don't see anything wrong with them, and I don't think they should influence the generated face. I have left the code mostly unchanged, but I did set the number of inference steps to 25.

Since the training was done on a dataset with no faces, I think that is possibly the reason the faces are distorted, and we can't do anything about it. Or do you have any suggestions?

K Aashish Chandra · Answer 10 · Sat Jul 15 2023 20:35:03 GMT+0800 (China Standard Time)

> @kira1433 thanks a lot for the insight, I was stuck on this. Would you recommend using a new photo altogether, or can we make this one work somehow?

You should go for other images. But if this photo is very important to you, you could change the agnostic code a little so that your image also works.

Emad · Answer 11 · Sun Jul 16 2023 19:22:39 GMT+0800 (China Standard Time)

Hi, I did the same and had the same issue. In our testing, we even tried recreating ideal photo conditions with a backlight and green screen, but the face was always distorted, specifically the upper area of the face.

Rishi · Answer 12 · Sun Jul 16 2023 19:49:26 GMT+0800 (China Standard Time)

@Emadeon did you use VITON-HD? With it, my results were bad as well. @ABaldrati I followed everything in the pre-processing pipeline. One more question: should I use a cloth sample from the dataset, or is any cloth image fine?

K Aashish Chandra · Answer 13 · Sun Jul 16 2023 20:34:56 GMT+0800 (China Standard Time)

> Hi, I did the same and had the same issue. In our testing, we even tried recreating ideal photo conditions with a backlight and green screen, but the face was always distorted, specifically the upper area of the face.

Yeah, I agree. Were you able to fix this issue somehow? Or did you decide that extracting the face and keeping it in the final image is better?

Emad · Answer 14 · Sun Jul 16 2023 20:56:23 GMT+0800 (China Standard Time)

> @Emadeon did you use VITON-HD?

I used DressCode.

> Were you able to fix this issue somehow? Or did you decide that extracting the face and keeping it in the final image is better?

I thought the issue might be that the DressCode dataset does not include faces. I decided to extract just the area around the eyes and replace that, because other than the eyes and eyebrows, everything else was fine.
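The workaround described here (keeping the original pixels for a small facial region) can be sketched as a masked blend. This is an illustration under assumptions, not the code used in this thread: the function name `paste_back` is hypothetical, both images are assumed to be aligned and the same size, and the region mask (for example a box around the eyes) has to come from somewhere else, such as a face parser or a manual selection.

```python
import numpy as np


def paste_back(original, generated, region_mask):
    """Blend original pixels into the generated image inside region_mask.

    original, generated: H x W x C uint8 arrays of the same shape.
    region_mask: H x W array with values in [0, 1]; 1 keeps the original
    pixel, 0 keeps the generated pixel, and values in between blend
    the two (useful for a feathered edge).
    """
    original = np.asarray(original, dtype=np.float32)
    generated = np.asarray(generated, dtype=np.float32)
    # Add a channel axis so the mask broadcasts over the colour channels.
    alpha = np.asarray(region_mask, dtype=np.float32)[..., None]
    blended = alpha * original + (1.0 - alpha) * generated
    return np.clip(blended, 0, 255).astype(np.uint8)
```

A hard 0/1 mask can leave a visible seam, so blurring the mask edge before blending is a common choice. Whether this looks acceptable depends on how far the generated face has drifted from the original.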

1_corn · Answer 15 · Mon Jul 17 2023 14:27:17 GMT+0800 (China Standard Time)

Can you share a cloud drive link to the DressCode dataset?

K Aashish Chandra · Answer 16 · Mon Jul 31 2023 18:54:50 GMT+0800 (China Standard Time)

Sorry for the late reply. You can try filling in the form in the DressCode repository; that is the only way to get access to the dataset.

K Aashish Chandra · Answer 17 · Mon Jul 31 2023 18:59:34 GMT+0800 (China Standard Time)

> Hello, may I contact you by email? If so, please leave your email address. Thank you!

Yes, but I don't know Chinese. Here is my email: f20210467@hyderabad.bits-pilani.ac.in

MrChen · Answer 18 · Thu Dec 28 2023 17:09:14 GMT+0800 (China Standard Time)

Yes

Hello, could you share the DressCode dataset?