single-cell graph attentional clustering
Python --- 3.6.4
Tensorflow --- 1.12.0
Keras --- 2.1.0
Numpy --- 1.19.5
Scipy --- 1.5.4
Pandas --- 1.1.5
Sklearn --- 0.24.2
All the original tested datasets (Yan, Biase, Klein, Romanov, Muraro, Björklund, PBMC, Zhang, Guo, Brown.1, Brown.2, Chung, Sun.1, Sun.2, Sun.3 and Habib) can be downloaded.
For example, the original expression matrix ori_data.tsv of dataset Biase is downloaded and put into /data/Biase. Before clustering, low-quality cells and genes can be filtered by running the following command:
python preprocess.py Biase
This produces a pre-processed expression matrix data.tsv under /data/Biase.
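To illustrate what filtering low-quality cells and genes can look like, here is a minimal sketch. It is not the logic of preprocess.py, whose criteria are not described here. The function name filter_counts, the genes-by-cells orientation, and both thresholds are assumptions made for illustration only.

```python
import numpy as np


def filter_counts(counts, min_genes_per_cell=1, min_cells_per_gene=1):
    """Drop sparse cells, then sparse genes, from a genes x cells count matrix.

    Hypothetical helper: the thresholds and the matrix orientation are
    assumptions, not values taken from preprocess.py.
    """
    counts = np.asarray(counts)
    # A gene counts as detected in a cell when its count is above zero.
    keep_cells = (counts > 0).sum(axis=0) >= min_genes_per_cell
    counts = counts[:, keep_cells]
    # Re-count detections after removing cells, then filter genes.
    keep_genes = (counts > 0).sum(axis=1) >= min_cells_per_gene
    return counts[keep_genes], keep_genes, keep_cells


# Toy matrix: 3 genes (rows) x 3 cells (columns).
toy = np.array([[1, 0, 3],
                [0, 0, 2],
                [0, 0, 0]])
filtered, kept_genes, kept_cells = filter_counts(toy)
```

In this toy matrix the second cell expresses no gene and the third gene is expressed in no cell, so both are removed.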
To use scGAC, specify the two parameters dataset_str and n_clusters, and run the following command:
python scGAC.py dataset_str n_clusters
where dataset_str is the name of the dataset and n_clusters is the number of clusters.
For example, for dataset Biase, you can run the following command:
python scGAC.py Biase 3
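If you prefer to launch scGAC from a Python script, the command line above can be assembled as an argument list. This is a minimal sketch. The helper build_scgac_command is hypothetical and not part of scGAC, and it assumes the script is started from the repository root.

```python
import sys


def build_scgac_command(dataset_str, n_clusters, *extra_args):
    """Return the argument list for `python scGAC.py dataset_str n_clusters`.

    Hypothetical helper, not part of scGAC. Any extra_args are appended
    unchanged after the two positional parameters.
    """
    command = [sys.executable, "scGAC.py", dataset_str, str(n_clusters)]
    command.extend(str(arg) for arg in extra_args)
    return command


cmd = build_scgac_command("Biase", 3)
```

The resulting list can be passed to subprocess.run(cmd, check=True), which raises an error if the script exits with a non-zero status.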
For your own dataset named Dataset_X, first create a new folder under /data, put the expression matrix file data.tsv into /data/Dataset_X, and then run scGAC on it.
Please note that we recommend using the raw count expression matrix as the input of scGAC.
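The folder layout for a custom dataset can be prepared with a few lines of Python. This is a minimal sketch. The helper install_dataset is hypothetical, and the header and the genes-as-rows layout of the toy matrix are assumptions, since the exact layout scGAC expects in data.tsv is not specified here. The demo writes into a temporary directory rather than a real checkout.

```python
import csv
import tempfile
from pathlib import Path


def install_dataset(repo_root, dataset_name, header, rows):
    """Write a tab-separated matrix to <repo_root>/data/<dataset_name>/data.tsv.

    Hypothetical helper, not part of scGAC.
    """
    target_dir = Path(repo_root) / "data" / dataset_name
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "data.tsv"
    with target.open("w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(header)
        writer.writerows(rows)
    return target


# Demo in a throwaway directory so nothing in a real checkout is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = install_dataset(
        tmp,
        "Dataset_X",
        ["gene", "cell_1", "cell_2"],          # assumed header layout
        [["GeneA", 5, 0], ["GeneB", 2, 7]],    # toy raw counts
    )
    relative = path.relative_to(tmp).as_posix()
    first_line = path.read_text().splitlines()[0]
```

With a real checkout, pass the repository root as repo_root so that the file lands in the data folder that scGAC.py reads from.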
You can obtain the predicted clustering result pred_DatasetX.txt and the learned cell embeddings hidden_DatasetX.tsv under the folder /result.
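A quick way to inspect a prediction file is to count how many cells fall into each cluster. The sketch below assumes the file holds one cluster label per line. That layout is an assumption and is not documented here, so adjust the parser if your file differs. The helper names read_labels and cluster_sizes are hypothetical.

```python
import io
from collections import Counter


def read_labels(handle):
    """Read one cluster label per non-empty line (assumed file layout)."""
    return [line.strip() for line in handle if line.strip()]


def cluster_sizes(labels):
    """Map each cluster label to the number of cells assigned to it."""
    return dict(Counter(labels))


# In-memory stand-in for a prediction file with five cells.
example = io.StringIO("0\n1\n1\n2\n1\n")
labels = read_labels(example)
sizes = cluster_sizes(labels)
```

For a real run, replace the in-memory example with an opened file, for instance open("result/pred_DatasetX.txt").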
To see the optional parameters, you can run the following command:
python scGAC.py -h
For example, if you want to evaluate the clustering results (by specifying --subtype_path) and change the number of nearest neighbors (by specifying --k), you can run the following command:
python scGAC.py Biase 3 --subtype_path data/Biase/subtype.ann --k 4
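Evaluating a clustering against known subtypes usually means comparing two label lists with an agreement score. The sketch below implements the adjusted Rand index (ARI) as one common choice. Which metrics scGAC reports when --subtype_path is given is not stated here, so the choice of ARI is an assumption, and the function is an illustration rather than scGAC's own evaluation code.

```python
from collections import Counter


def _pairs(n):
    """Number of unordered pairs that can be drawn from n items."""
    return n * (n - 1) // 2


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    total_pairs = _pairs(len(true_labels))
    if total_pairs == 0:
        return 1.0  # fewer than two cells: trivially identical
    # Pairs that share a cluster in both labelings, and in each one alone.
    both = sum(_pairs(c) for c in Counter(zip(true_labels, pred_labels)).values())
    in_true = sum(_pairs(c) for c in Counter(true_labels).values())
    in_pred = sum(_pairs(c) for c in Counter(pred_labels).values())
    expected = in_true * in_pred / total_pairs
    maximum = (in_true + in_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both
    return (both - expected) / (maximum - expected)


# Same partition with swapped label names still counts as perfect agreement.
perfect = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
```

The score depends only on which cells are grouped together, so predicted cluster ids do not need to match the subtype names.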
Results in the paper were obtained with default parameters.