This is the reproducible code for the islet paper.
This folder contains the code and data used for the scRNA-seq analysis.
The processed beta cell scRNA-seq dataset is stored on cowtransfer.
The beta cell scRNA-seq dataset after ML-based quality control is also stored on cowtransfer.
*.R: R code for downstream analysis of scRNA-seq, including quality control, clustering, and cell annotation.
scVI and scANVI were trained on the well-annotated scRNA-seq datasets to learn cell representations.
The trained h5ad file is stored on cowtransfer.
This folder contains the code and data used for the scATAC-seq analysis (multiome + scATAC-seq).
The processed beta cell scATAC-seq data is stored on cowtransfer.
*.smk: Snakemake scripts for cellranger.
*.R: R code for downstream analysis of scATAC-seq, including quality control, clustering, cell annotation, and peak calling.
For ChIP-seq data, we only run the basic upstream analysis, such as quality control and mapping. The BAM file of the H3K27ac modification is used as input to the ABC model.
For Hi-C data, we run the basic upstream analysis, such as quality control and mapping (this workflow follows the one from Renlab). The .hic file is used as input to the ABC model.
First, split the scATAC BAM file by cell type. Then generate gene expression values (TPM) from the multiome dataset of Wang et al. 2023.
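As a rough illustration of the TPM step, the sketch below converts raw counts for one cell type into TPM. The function name `counts_to_tpm`, the toy counts, and the gene lengths are all hypothetical, and whether the repository's own code length-normalizes the multiome counts in this way is an assumption.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw gene counts for one sample to TPM.

    counts: read (or UMI) counts per gene
    lengths_bp: gene lengths in base pairs, in the same order
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Reads per kilobase: normalize each gene by its length first.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        return [0.0] * len(counts)
    # Scale so that the values of one sample sum to one million.
    return [r / total * 1e6 for r in rpk]


# Hypothetical pseudobulk counts for three genes in one cell type.
toy_counts = [100, 200, 300]
toy_lengths = [1000, 2000, 3000]
toy_tpm = counts_to_tpm(toy_counts, toy_lengths)
```

Because each gene in this toy input has the same count per kilobase, all three genes receive the same TPM value, and the values sum to one million.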
After preparing all the data, use the code in run_ABC.Rmd to generate enhancer-promoter interactions with the ABC model.
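For orientation, the ABC score of an element for a gene is its activity times its contact frequency with the gene's promoter, divided by the sum of that product over all candidate elements near the gene. The sketch below shows only this arithmetic. It is not the code in run_ABC.Rmd, and the function names and toy signal values are hypothetical.

```python
import math


def element_activity(atac_signal, h3k27ac_signal):
    """Activity as the geometric mean of accessibility and H3K27ac signal."""
    return math.sqrt(atac_signal * h3k27ac_signal)


def abc_scores(elements):
    """Compute ABC scores for the candidate elements of one gene.

    elements: list of (element_id, activity, contact) tuples, where contact
    is the Hi-C contact frequency between the element and the gene promoter.
    """
    products = {eid: activity * contact for eid, activity, contact in elements}
    total = sum(products.values())
    if total == 0:
        return {eid: 0.0 for eid in products}
    # Each element's share of the total activity-by-contact for this gene.
    return {eid: value / total for eid, value in products.items()}


# Hypothetical candidate elements for a single gene.
candidates = [
    ("e1", element_activity(4, 9), 0.5),
    ("e2", element_activity(1, 1), 1.0),
    ("e3", element_activity(16, 4), 0.25),
]
scores = abc_scores(candidates)
```

The scores for one gene sum to 1, so each value can be read as the fraction of that gene's total regulatory input attributed to the element.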
Then use motifmatchr to assign TFs to the corresponding cis-regulatory elements.
Due to the heterogeneity of human data, we found that some donors had discrepant gene expression profiles and showed extremely low prediction accuracy (15%). We therefore ran XGBoost iteratively to remove these donors until no donor with low prediction accuracy remained, and then calculated the differentially expressed genes.
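The iterative removal can be pictured as the loop below. This is a minimal sketch under stated assumptions: `score_fn` stands in for training an XGBoost classifier and returning a per-donor prediction accuracy, and here it is replaced by a toy function so the example runs on its own. The threshold of 0.5, the donor names, the toy profiles, and the choice to drop every low-accuracy donor in each round are all hypothetical.

```python
import statistics


def iterative_donor_filter(donors, score_fn, min_accuracy):
    """Repeatedly drop donors whose accuracy falls below min_accuracy.

    score_fn takes the list of currently kept donors and returns a dict
    mapping each donor to an accuracy in [0, 1]. It is re-evaluated after
    every removal round because scores depend on which donors remain.
    """
    kept = list(donors)
    removed = []
    while kept:
        accuracy = score_fn(kept)
        low = [d for d in kept if accuracy[d] < min_accuracy]
        if not low:
            break  # every remaining donor passes the threshold
        removed.extend(low)
        kept = [d for d in kept if d not in low]
    return kept, removed


# Toy stand-in for the classifier: a donor scores high when its
# (hypothetical) profile value is close to the median of the kept donors.
toy_profiles = {"D1": 1.0, "D2": 1.1, "D3": 0.9, "D4": 5.0, "D5": 1.2}


def toy_score_fn(kept):
    center = statistics.median(toy_profiles[d] for d in kept)
    return {d: max(0.0, 1.0 - abs(toy_profiles[d] - center)) for d in kept}


kept_donors, removed_donors = iterative_donor_filter(
    list(toy_profiles), toy_score_fn, min_accuracy=0.5
)
```

In this toy run the outlying donor D4 is removed in the first round and the remaining four donors all pass in the second round, at which point the loop stops.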