dongyangli-del / EEG_Image_decode

Using vision-language models to decode natural image perception from non-invasive brain recordings.

test dataset

xuchengjian632 opened this issue

May I ask why, in def __getitem__(), index_n_sub_test has to be multiplied by 80 and test_index has to be divided by 80, when the 80 repetitions of the test data have already been averaged in EEG_dataset() in EEG_Image_decode/Generation/eegdatasets_leaveone.py?

@xuchengjian632 This is an engineering detail. We need to average the EEG data here, but the EEG labels still need to be correct. The indexing is written this way so that the corresponding label is found for every sample and the data stay matched to their labels.

But after the data processing here, once data_list and label_list have been concatenated, their shapes are (200, 63, 250) and (200, ) respectively. Don't the EEG data and labels already correspond to each other?
@xuchengjian632 Obviously, (200, 63, 250) represents 200 categories. Each category has 1 sample, so that dimension is omitted; each sample has 63 channels and a length of 250 time points.
(200, ) represents 200 categories. Each category has 1 sample label, so that dimension is also omitted.
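
As an illustration of the shapes described above, here is a minimal sketch; it is not the repository's code. The function name average_repetitions and the raw layout (classes, repetitions, channels, time points) are assumptions made for the example, and the demo uses only 4 classes to keep it light.

```python
import numpy as np

def average_repetitions(raw):
    """Average over the repetition axis (axis 1).

    raw is assumed to have shape (n_classes, n_reps, n_channels, n_times),
    with one test image per class. This layout is hypothetical.
    """
    return raw.mean(axis=1)

rng = np.random.default_rng(0)
# Hypothetical stand-in for the test data: 4 classes, 80 repetitions,
# 63 channels, 250 time points (the real test set would have 200 classes).
raw = rng.standard_normal((4, 80, 63, 250))

averaged = average_repetitions(raw)
labels = np.arange(raw.shape[0])  # one label per class

print(averaged.shape)  # (4, 63, 250)
print(labels.shape)    # (4,)
```

With 200 classes the same operation would give the (200, 63, 250) and (200, ) shapes quoted in the thread.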

@dongyangli-del I understand the meaning of the data dimensions. What I mean is that once the test dataset has been processed by the load_data() function, the EEG data and labels are already matched (both have a first dimension of 200). So is it necessary to divide by 80 again in __getitem__(self, index)?

@xuchengjian632 Thank you for your question.
In our code, we considered a more general situation, because averaging all repetitions of the test set does not necessarily give the best results.
According to the conclusions in the paper by Song et al., the best results are achieved when averaging 55 repetitions. Our code is written to be extensible so that this kind of scaling-law test can be run.

Citations
Song, Yonghao, et al. "Decoding Natural Images from EEG for Object Recognition." arXiv preprint arXiv:2308.13234 (2023).
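
The more general situation described above can be sketched as follows. This is only an illustration under stated assumptions, not the repository's implementation: average_first_k and label_for_index are hypothetical helpers, and the raw array is assumed to be laid out as (classes, repetitions, channels, time points).

```python
import numpy as np

def average_first_k(raw, k):
    """Average only the first k repetitions of each test item (hypothetical helper)."""
    if not 1 <= k <= raw.shape[1]:
        raise ValueError("k must be between 1 and the number of repetitions")
    return raw[:, :k].mean(axis=1)

def label_for_index(index, n_classes, reps_per_item):
    """Map a flat sample index to its class label.

    Assumes samples are stored class by class, with reps_per_item
    consecutive samples per class.
    """
    n_samples = n_classes * reps_per_item
    return (index % n_samples) // reps_per_item

rng = np.random.default_rng(0)
raw = rng.standard_normal((4, 80, 3, 10))  # small stand-in array

print(average_first_k(raw, 55).shape)  # (4, 3, 10)
print(label_for_index(80, 200, 80))    # 1: first sample of the second class
print(label_for_index(5, 200, 1))      # 5: one sample per class, identity mapping
```

When every class has already been reduced to a single averaged sample (reps_per_item = 1), the mapping is the identity; the multiplication and division only matter when several samples per class are kept.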

@dongyangli-del So, when the test set data are acquired in the normal way, does the code here not need to divide by 80?

Hi @xuchengjian632, I think you have not yet worked out the basic logic of the code. If you have other questions, you can add me on WeChat: KeepRevere2Nature.
Now that the original question in this issue has been resolved, I will close the issue.

@dongyangli-del Okay, thank you for your response.