theislab / scgen

Single cell perturbation prediction

Home Page: https://scgen.readthedocs.io

labels_key mismatch in scgen.SCGEN.setup_anndata()

johnjeang opened this issue

Hello, I am trying to run batch correction on some single-cell RNA-seq data. The data has cell type labels, so I thought scGen would be a good fit.

For reference, I am following the Google Colab tutorial here: https://colab.research.google.com/github/theislab/scgen/blob/master/docs/tutorials/scgen_batch_removal.ipynb#scrollTo=OMMhgkQlpb8s

When trying to run

scgen.SCGEN.setup_anndata(train, batch_key="source", labels_key="cell_type")

I get the following error about my labels, which indicates some kind of mismatch. The label names are identical in both lists, so it looks like some data structure issue is causing the mismatch?

ValueError: Making .obs["cell_type"] categorical failed. Expected categories: ['astro' 'endothelial' 'microglia' 'neuron' 'oligo' 'opc' 'tcell' 'unknown']. Received categories: Index(['astro', 'endothelial', 'microglia', 'neuron', 'oligo', 'opc', 'tcell', 'unknown'], dtype='object').

I am working with:
Python 3.8.3
scanpy 1.9.1
scgen 2.1.0

Any ideas on how to solve this issue?
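
One thing that may be worth checking, since the expected and received category names are identical: missing values (NaN) in the label column. This is only a guess at the cause and the traceback above does not confirm it. pandas assigns the code -1 to missing entries of a categorical column, so a check based on category codes could fail even when the names match. Below is a minimal sketch using pandas only. The `obs` frame is a hypothetical stand-in for `train.obs`, and `has_missing_labels` is an illustrative helper, not part of scgen or scanpy.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train.obs; one cell has no label.
obs = pd.DataFrame({"cell_type": ["astro", "neuron", np.nan, "oligo"]})


def has_missing_labels(frame, key):
    """Return True if frame[key] has missing values.

    Missing entries receive the code -1 when the column is
    converted to a pandas categorical.
    """
    codes = frame[key].astype("category").cat.codes
    return bool((codes == -1).any())


print(has_missing_labels(obs, "cell_type"))

# One possible fix (an assumption about what is appropriate for the data):
# give unlabeled cells an explicit label before calling setup_anndata.
cleaned = obs.copy()
cleaned["cell_type"] = cleaned["cell_type"].fillna("unknown").astype("category")

print(has_missing_labels(cleaned, "cell_type"))
```

On the real object, the same check would be run as `has_missing_labels(train.obs, "cell_type")`. Whether relabeling unlabeled cells as "unknown" or dropping them is the better choice depends on the dataset.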