Finally start work on `prodigy-embetter`

koaning opened this issue a year ago

vincent d warmerdam commented a year ago

python -m prodigy textcat.emb.manual <dataset> <examples.jsonl> --labels --loader --anchors --exclusive
python -m prodigy image.clip.by_text <dataset> <examples.jsonl> --labels --loader --anchors --exclusive --remove-base64
python -m prodigy image.clip.by_image <dataset> <examples.jsonl> --labels --loader --anchors --exclusive --remove-base64

vincent d warmerdam · Answer 1 · Wed Aug 09 2023 16:22:32 GMT+0800 (China Standard Time)

After working on the "frontpage" project, I no longer think this is the best way to go about it. Calculating the embeddings on the fly is expensive, so it may be better to precompute them and query a simple approximate nearest neighbor (ANN) index instead.
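
To make the ANN idea concrete, here is a minimal sketch of what such an index could look like. It is not how `prodigy-embetter` or the "frontpage" project does it: it is a toy random-hyperplane LSH index in pure Python, and the names `HyperplaneIndex`, `cosine`, `n_planes` and `seed` are all made up for illustration. It assumes the embeddings were already computed once and are passed in as plain lists of floats; in practice you would probably reach for an existing ANN library rather than roll your own.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HyperplaneIndex:
    """Toy random-hyperplane LSH index. Hypothetical, not a Prodigy or embetter API."""

    def __init__(self, dim, n_planes=8, seed=0):
        rng = random.Random(seed)
        # Each random hyperplane contributes one bit to the bucket key.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = defaultdict(list)
        self.vectors = []

    def _key(self, vec):
        # The sign of the dot product with each plane decides the bit.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in self.planes
        )

    def add(self, vec):
        """Store a precomputed embedding and return its integer id."""
        idx = len(self.vectors)
        self.vectors.append(vec)
        self.buckets[self._key(vec)].append(idx)
        return idx

    def query(self, vec, k=3):
        """Return up to k ids from the query's bucket, most similar first."""
        candidates = self.buckets.get(self._key(vec), [])
        ranked = sorted(
            candidates, key=lambda i: cosine(vec, self.vectors[i]), reverse=True
        )
        return ranked[:k]


if __name__ == "__main__":
    index = HyperplaneIndex(dim=3)
    for vec in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]):
        index.add(vec)
    print(index.query([1.0, 0.0, 0.0], k=2))
```

The expensive embedding step happens once, when the index is built, and each lookup only compares against the vectors that share a bucket with the query. The trade-off is that results are approximate: a true neighbor that lands in a different bucket is missed, which is why real ANN libraries use several hash tables or graph-based structures.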