labels_key mismatch in scgen.SCGEN.setup_anndata()
johnjeang opened this issue
Hello, I am trying to run batch correction on some single-cell RNAseq data. This data has some cell type labels, so I thought the scGEN method would be a good fit.
For reference, I am following the Google Colab tutorial here: https://colab.research.google.com/github/theislab/scgen/blob/master/docs/tutorials/scgen_batch_removal.ipynb#scrollTo=OMMhgkQlpb8s
When trying to run
scgen.SCGEN.setup_anndata(train, batch_key="source", labels_key="cell_type")
I get the following error about my labels, which reports some kind of mismatch. The label names in the two lists are identical, so it looks like a data structure issue (the expected categories print as an array, the received ones as a pandas Index) rather than a real difference in labels?
ValueError: Making .obs["cell_type"] categorical failed. Expected categories: ['astro' 'endothelial' 'microglia' 'neuron' 'oligo' 'opc' 'tcell' 'unknown']. Received categories: Index(['astro', 'endothelial', 'microglia', 'neuron', 'oligo', 'opc', 'tcell', 'unknown'], dtype='object').
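One way to narrow this down might be to inspect the label column directly with pandas. The sketch below is not part of scgen: `describe_label_column` is a hypothetical helper, and the toy `obs` table stands in for `train.obs`. It rests on an assumption that is not confirmed here, namely that missing values in the column could be involved. pandas gives a NaN the code -1 and does not list it among the categories, which would explain two category lists that print identically.

```python
import pandas as pd


def describe_label_column(obs, key):
    """Summarize a label column (hypothetical helper, not a scgen API)."""
    col = obs[key]
    as_cat = col.astype("category")
    return {
        # Storage dtype of the column as it currently stands
        "dtype": str(col.dtype),
        # Missing labels never appear in the category list
        "n_missing": int(col.isna().sum()),
        # Categories pandas infers from the non-missing values
        "categories": list(as_cat.cat.categories),
        # A code of -1 marks a value that belongs to no category
        "has_negative_codes": bool((as_cat.cat.codes == -1).any()),
    }


# Toy stand-in for train.obs, with one missing label on purpose
obs = pd.DataFrame({"cell_type": ["astro", "neuron", None, "oligo"]})
report = describe_label_column(obs, "cell_type")
print(report)
```

If `n_missing` is non-zero on the real data, filling or dropping those cells before calling `setup_anndata` would be one thing to try, for example `train.obs["cell_type"].fillna("unknown")` if the column is not already categorical. Whether that resolves this particular error is untested here.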
I am working in
Python 3.8.3
scanpy 1.9.1
scgen 2.1.0
Any ideas on how to solve this issue?