ZJUFanLab / scDeepSort

Cell-type Annotation for Single-cell Transcriptomics using Deep Learning with a Weighted Graph Neural Network

Home Page: https://doi.org/10.1093/nar/gkab775

Trainer AssertionError

ViriatoII opened this issue

I can't overcome this assertion error:

~/.local/lib/python3.8/site-packages/deepsort/trainer.py in __load_train_data(self, files, save_dir)
    239             # filter out cells not in label-text
    240             df = df.iloc[filter_cell]
--> 241             assert cell2type['cell'].tolist() == df.index.tolist()
    242             df = df.rename(columns=gene2id)
    243             # filter out useless columns if exists (when using gene intersection)

AssertionError: 

This is probably because my input data does not exactly match the format the trainer expects. What exactly do the tutorial files ('/path/to/human_brain_data_1.csv', '/path/to/human_brain_celltype_1.csv') look like?
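The assertion in the traceback compares the `cell` column of the label table with the index of the expression table, element by element and in order. A minimal sketch for narrowing down why that comparison fails is shown here; `describe_mismatch` and the toy frames are hypothetical and not part of scDeepSort, and the sketch assumes the expression frame is already indexed by cell, as it is at the point of the assertion.

```python
import pandas as pd


def describe_mismatch(expr: pd.DataFrame, labels: pd.DataFrame) -> dict:
    """Summarize why labels['cell'].tolist() == expr.index.tolist() may be False.

    Hypothetical helper for debugging; it only mirrors the comparison
    shown in the traceback and does not call scDeepSort.
    """
    label_cells = labels["cell"].tolist()
    index_cells = expr.index.tolist()
    return {
        # The exact comparison the assertion performs.
        "equal_as_is": label_cells == index_cells,
        # Element types: an int/str difference makes the lists unequal.
        "label_type": type(label_cells[0]).__name__,
        "index_type": type(index_cells[0]).__name__,
        # Would the lists match if both were cast to strings?
        "equal_as_str": [str(c) for c in label_cells] == [str(c) for c in index_cells],
        # Same cells regardless of order?
        "same_set_as_str": set(map(str, label_cells)) == set(map(str, index_cells)),
    }


# Toy data (assumed values): string cell names in the expression index,
# integer cell names in the label table.
expr = pd.DataFrame(
    {"CD4": [0.0, 0.293348], "DCN": [0.0, 0.157392]},
    index=["8194", "6001"],
)
labels = pd.DataFrame({"cell": [8194, 6001], "type": ["t-cells", "kupffer cells"]})

report = describe_mismatch(expr, labels)
print(report)
```

If `equal_as_is` is False but `equal_as_str` is True, the names agree and only their types differ. If only `same_set_as_str` is True, the cells are the same but in a different order.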

Here are my inputs:

x_train.iloc[:,:3]
cell   8194  6001      4401
CD4    0.0   0.293348  0.0
DCN    0.0   0.157392  0.0
...    ...   ...       ...
FCGBP  0.0   0.000000  0.0
GJA5   0.0   0.000000  0.0

95 rows × 5584 columns

      cell  type                    id
8090  8194  t-cells                 0
5921  6001  kupffer cells           2
...   ...   ...                     ...
860   862   kupffer cells           2
7270  7360  hepatic stellate cells  7

5584 rows × 3 columns

Thank you,

Solved it with two changes:
1. Replaced the column names of the first table with the corresponding cell names from the second table.
2. Added a line to trainer.py, before the assertion, to force the cell names to be strings, since mine were being interpreted as integers.

~/.local/lib/python3.8/site-packages/deepsort/trainer.py

--->243 cell2type['cell'] = cell2type['cell'].map(str)
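The same idea can be tried on the input tables before they are passed to the trainer. This is a minimal sketch only: `align_cell_names` and the toy frames are hypothetical, and it assumes genes are rows and cells are columns in the expression table, as in the `x_train` output above.

```python
import pandas as pd


def align_cell_names(expr: pd.DataFrame, labels: pd.DataFrame):
    """Cast cell names to strings on both tables and verify they line up.

    Hypothetical helper, not part of scDeepSort. `expr` has genes as rows
    and cells as columns; `labels` has one row per cell with a 'cell' column.
    """
    expr = expr.copy()
    labels = labels.copy()
    # Same cast as the one-line patch above, applied to the label table.
    labels["cell"] = labels["cell"].map(str)
    # Matching cast for the cell names used as expression column headers.
    expr.columns = expr.columns.astype(str)
    # The trainer's assertion is order-sensitive, so check order here too.
    if expr.columns.tolist() != labels["cell"].tolist():
        raise ValueError("cell names differ, or are in a different order")
    return expr, labels


# Toy data (assumed values) with integer cell names on both sides.
x_train = pd.DataFrame(
    {8194: [0.0, 0.0], 6001: [0.293348, 0.157392], 4401: [0.0, 0.0]},
    index=["CD4", "DCN"],
)
cell_labels = pd.DataFrame(
    {"cell": [8194, 6001, 4401], "type": ["t-cells", "kupffer cells", "kupffer cells"]}
)

x_fixed, labels_fixed = align_cell_names(x_train, cell_labels)
print(x_fixed.columns.tolist())
print(labels_fixed["cell"].tolist())
```

Note that pandas may infer integer types again when numeric-looking names are read back from a CSV file, which would be consistent with the cast being needed inside trainer.py itself. That explanation is an assumption and was not confirmed in the thread.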