Do not know what to do with datacard
lilygeorgescu opened this issue · comments
Hi,
Thank you for this nice work and for making it public!
I do not know how to obtain the curated dataset, i.e. how to use the datacard to get the training data and start training.
If anyone has any idea, please let me know.
Thanks in advance.
datacard is a new term we invented for training data distribution, so it's not the concrete dataset. You prob. need to follow the code in metaclip to do curation on CommonCrawl to get the full dataset.
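
For illustration only, here is a toy sketch of what "curating against a distribution" can look like, assuming curation means (1) keeping only captions that contain some entry from a metadata list and (2) down-sampling entries that occur too often. This is not the code from the metaclip repo: `match_entries`, `curate`, the `metadata` list, and the threshold `t` are hypothetical names chosen for this example, and the real pipeline on CommonCrawl may differ in many details.

```python
import random
from collections import Counter


def match_entries(text, metadata):
    """Return the metadata entries that occur as substrings of the caption."""
    lowered = text.lower()
    return [entry for entry in metadata if entry in lowered]


def curate(pairs, metadata, t, seed=0):
    """Toy curation over (url, caption) pairs. Hypothetical helper, not a metaclip API.

    A pair is dropped if its caption matches no metadata entry. Otherwise it
    is kept if at least one matched entry passes a random draw, where rare
    entries (count <= t) always pass and frequent entries pass with
    probability t / count.
    """
    rng = random.Random(seed)
    matched = [(url, text, match_entries(text, metadata)) for url, text in pairs]

    # How many captions each metadata entry matched.
    counts = Counter(entry for _, _, entries in matched for entry in entries)

    # Keep probability per entry: 1.0 up to the threshold, t / count above it.
    keep_prob = {entry: t / max(count, t) for entry, count in counts.items()}

    curated = []
    for url, text, entries in matched:
        if any(rng.random() < keep_prob[entry] for entry in entries):
            curated.append((url, text))
    return curated


if __name__ == "__main__":
    pairs = [
        ("u1", "A photo of a dog"),
        ("u2", "random string xyz"),
        ("u3", "Cat on a sofa"),
    ]
    for url, text in curate(pairs, metadata=["dog", "cat"], t=10):
        print(url, text)
```

In this sketch the "distribution" is the per-entry keep probability, which is why a file describing it cannot be turned into training data without first collecting and matching the raw image-text pairs.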