huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Home Page: https://huggingface.co/docs/datasets

FAISS load to None

brainer3220 opened this issue

Describe the bug

I used FAISS with Datasets and saved the index to a FAISS file.

When I then load the saved FAISS index, no error is raised, but the call returns None:

ds.load_faiss_index('embeddings', 'my_index.faiss')

Steps to reproduce the bug

1.

ds_with_embeddings = ds.map(lambda example: {'embeddings': model(transforms(example['image']).unsqueeze(0)).squeeze()}, batch_size=64)

ds_with_embeddings.add_faiss_index(column='embeddings')

ds_with_embeddings.save_faiss_index('embeddings', 'index.faiss')

2.

ds.load_faiss_index('embeddings', 'my_index.faiss')

Expected behavior

The embeddings column is added to the dataset.

Environment info

Google Colab, SageMaker Notebook

Hello,

I'm not sure I understand.
The return value of ds.load_faiss_index is None as expected.

I see that loading an index on a dataset that doesn't have an embeddings column doesn't raise an error. Is that the issue?

So ds doesn't have an embeddings column, yet we load an index that looks for it. This only raises an error when ds.search is called.