Trainer AssertionError
ViriatoII opened this issue
I can't overcome this assertion error:
~/.local/lib/python3.8/site-packages/deepsort/trainer.py in __load_train_data(self, files, save_dir)
239 # filter out cells not in label-text
240 df = df.iloc[filter_cell]
--> 241 assert cell2type['cell'].tolist() == df.index.tolist()
242 df = df.rename(columns=gene2id)
243 # filter out useless columns if exists (when using gene intersection)
AssertionError:
This is probably because my input data does not exactly match the format the trainer expects. What exactly do the tutorial files ('/path/to/human_brain_data_1.csv', '/path/to/human_brain_celltype_1.csv') look like?
Here are my inputs:
x_train.iloc[:,:3]
cell | 8194 | 6001 | 4401 |
---|---|---|---|
CD4 | 0.0 | 0.293348 | 0.0 |
DCN | 0.0 | 0.157392 | 0.0 |
... | ... | ... | ... |
FCGBP | 0.0 | 0.000000 | 0.0 |
GJA5 | 0.0 | 0.000000 | 0.0 |
95 rows × 5584 columns
| | cell | type | id |
|---|---|---|---|
| 8090 | 8194 | t-cells | 0 |
| 5921 | 6001 | kupffer cells | 2 |
| ... | ... | ... | ... |
| 860 | 862 | kupffer cells | 2 |
| 7270 | 7360 | hepatic stellate cells | 7 |
5584 rows × 3 columns
Thank you.
Solved it with two changes:
1. Replaced the column names of the first table with the corresponding 'cell' values from the second table.
2. Added a line to your code, before the assertion, to force the cell names to be strings, since mine were being interpreted as integers.
~/.local/lib/python3.8/site-packages/deepsort/trainer.py
--->243 cell2type['cell'] = cell2type['cell'].map(str)
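For anyone hitting the same assertion, here is a minimal sketch of the dtype mismatch and the cast that resolves it. The toy tables are made up for illustration and only mirror the shapes shown above. It also assumes that the trainer compares the 'cell' column against a cells-as-rows index, which is inferred from the traceback rather than checked against the deepsort source.

```python
import pandas as pd

# Toy expression table (hypothetical values): genes as rows, cells as columns.
# Column labels are strings, as they typically are after reading a CSV header.
x_train = pd.DataFrame(
    {"8194": [0.0, 0.0], "6001": [0.293348, 0.157392], "4401": [0.0, 0.0]},
    index=["CD4", "DCN"],
)

# Toy label table (hypothetical values): 'cell' was parsed as integers.
cell2type = pd.DataFrame(
    {
        "cell": [8194, 6001, 4401],
        "type": ["t-cells", "kupffer cells", "kupffer cells"],
    }
)

# Assumption: the comparison is made against a cells-as-rows index.
df = x_train.T

# Same comparison as the failing assertion: ints never equal strs.
before = cell2type["cell"].tolist() == df.index.tolist()

# The workaround from this thread, plus the same cast on the index
# so that both sides are guaranteed to hold strings.
cell2type["cell"] = cell2type["cell"].map(str)
df.index = df.index.astype(str)

after = cell2type["cell"].tolist() == df.index.tolist()
print(before, after)
```

The comparison is also order-sensitive, so the two tables need to list the cells in the same order, not just contain the same IDs.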