alex000kim / nsfw_data_scraper

Collection of scripts to aggregate image data for the purposes of training an NSFW Image Classifier


Data Bias - Females in Porn

GantMan opened this issue

It seems the data is somewhat biased toward women in the porn category.
Link to issue for reference: infinitered/nsfwjs#16

It seems the scraper should be adjusted to counter-balance this bias.

I don't think there is much that can be done here.
Whatever model you train, it will still be a garbage-in, garbage-out type of system.
I did add a disclaimer to the README though.

I feel like adding more neutral photos of women might help. Thoughts?

Of course, it might.
I encourage people to contribute links to various *.txt files in https://github.com/alexkimxyz/nsfw_data_scraper/tree/master/scripts/source_urls/ in order to reduce the overall error and bias.
That said, there will always be some kind of bias.
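
For anyone who wants to see how the URL lists compare before contributing, here is a minimal sketch that counts the non-empty lines in each `*.txt` file under a directory and reports each file's share of the total. The function names `count_urls` and `shares` are hypothetical, and the directory layout is an assumption; point it at wherever your local copy of `scripts/source_urls/` lives. It only counts links, so it says nothing about what the images behind them actually show.

```python
from collections import Counter
from pathlib import Path


def count_urls(source_dir):
    """Count non-empty lines in every *.txt file under source_dir.

    Keys are file paths relative to source_dir (hypothetical helper,
    not part of the repository's scripts).
    """
    root = Path(source_dir)
    counts = Counter()
    for txt in sorted(root.rglob("*.txt")):
        with txt.open(encoding="utf-8", errors="ignore") as fh:
            # Treat every non-blank line as one source URL (an assumption).
            counts[txt.relative_to(root).as_posix()] = sum(
                1 for line in fh if line.strip()
            )
    return counts


def shares(counts):
    """Return each file's fraction of the total line count."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}
```

Calling `shares(count_urls("scripts/source_urls"))` from the repository root would give a rough picture of which lists dominate, which may help decide where extra links are most useful.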

Sounds good. Hopefully this will serve as a call to action on the data.