A repository containing datasets and tools to train a watermark classifier.
The datasets folder contains WebDataset files using the following keys:
- key: A unique identifier within the dataset
- url: The URL of the image data
- caption: The caption describing the image
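
As a rough illustration of how these keys could be read, here is a minimal sketch that uses only the Python standard library. It assumes the usual WebDataset convention, where a shard is a tar archive and each sample is stored as members sharing a basename, here `<key>.url` and `<key>.caption`. That layout, and the helper names `read_samples` and `build_example_shard`, are assumptions for illustration and are not confirmed by this repository; adjust the extensions to match the actual files.

```python
import io
import tarfile
from collections import defaultdict


def read_samples(fileobj):
    """Group tar members by basename and return one dict per sample.

    Assumes members are named '<key>.url' and '<key>.caption'
    (hypothetical layout; the real shards may differ).
    """
    fields = defaultdict(dict)
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Everything before the first dot is the sample key,
            # everything after it is the field name.
            key, _, ext = member.name.partition(".")
            fields[key][ext] = tar.extractfile(member).read().decode("utf-8")
    return [
        {"key": key, "url": parts.get("url"), "caption": parts.get("caption")}
        for key, parts in fields.items()
    ]


def build_example_shard(rows):
    """Build a small in-memory shard from (key, url, caption) tuples for testing."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, url, caption in rows:
            for ext, text in (("url", url), ("caption", caption)):
                data = text.encode("utf-8")
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf
```

To read a shard from disk, open it in binary mode and pass the file object to `read_samples`. The in-memory builder exists only so the sketch can be tried without downloading a dataset.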
Preliminary dataset annotations and work-in-progress (WIP) tools are available at https://github.com/robvanvolt/DALLE-tools. More mature tools and fully annotated datasets will be moved to this repository. Feel free to adjust the structure or to upload new annotations and annotation tools.
WIP: A tool for annotating the url-caption datasets online will be uploaded shortly.