csbg / neuroblastoma

Single-cell transcriptomics and epigenomics unravel the role of monocytes in neuroblastoma bone marrow metastasis

This code supplements the publication by Fetahu, Esser-Skala, Dnyansagar et al. (2023).

Folders

(Not all of these folders appear in the git repository.)

  • data_generated: output files generated by the scripts in this repository
  • data_raw: raw input data
  • doc: project documentation
  • literature: relevant publications
  • metadata: additional required data
  • misc: miscellaneous scripts
  • plots: generated plots
  • renv: R environment data
  • scatac: scripts for scATAC-seq analysis
  • tables: exported supplementary tables and data; the subfolder source_data contains source data for figures

Download data

Create a folder data_raw that will contain raw data in the following subfolders:

  • adrmed:
  • rna_seq: Download GSE216155_RAW.tar from GEO Series GSE216155 and extract all files.
  • atac_seq: Download all files from GEO Series GSE216175 (GSE216175_barcodes.tsv.gz, GSE216175_barcodes_samples.csv.gz, GSE216175_filtered_peak_bc_matrix.h5, GSE216175_matrix.mtx.gz, GSE216175_peaks.bed.gz, and GSE216175_RAW.tar). Extract all files from the tarball.
  • GSE137804: Download the following files from GEO Series GSE137804:
    • GSE137804_tumor_dataset_annotation.csv.gz
    • GSE137804_RAW.tar, from which the following eleven files must be extracted:
      • GSM4088774_T10_gene_cell_exprs_table.xls.gz
      • GSM4088776_T27_gene_cell_exprs_table.xls.gz
      • GSM4088777_T34_gene_cell_exprs_table.xls.gz
      • GSM4088780_T69_gene_cell_exprs_table.xls.gz
      • GSM4088781_T71_gene_cell_exprs_table.xls.gz
      • GSM4088782_T75_gene_cell_exprs_table.xls.gz
      • GSM4088783_T92_gene_cell_exprs_table.xls.gz
      • GSM4654669_T162_gene_cell_exprs_table.xls.gz
      • GSM4654672_T200_gene_cell_exprs_table.xls.gz
      • GSM4654673_T214_gene_cell_exprs_table.xls.gz
      • GSM4654674_T230_gene_cell_exprs_table.xls.gz
  • snp_array: Extract the contents of snp_array.tgz, provided in the Zenodo repository https://doi.org/10.5281/zenodo.7707614
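
The selective extraction from GSE137804_RAW.tar described above can be sketched in Python as follows. This is only an illustration and not part of the repository: the function name extract_selected and the tarball location in the usage note are assumptions, and only the Python standard library is used.

```python
import shutil
import tarfile
from pathlib import Path

# GSM accession and tumor label of the eleven files listed above.
SAMPLES = [
    ("GSM4088774", "T10"), ("GSM4088776", "T27"), ("GSM4088777", "T34"),
    ("GSM4088780", "T69"), ("GSM4088781", "T71"), ("GSM4088782", "T75"),
    ("GSM4088783", "T92"), ("GSM4654669", "T162"), ("GSM4654672", "T200"),
    ("GSM4654673", "T214"), ("GSM4654674", "T230"),
]
WANTED = [f"{gsm}_{tumor}_gene_cell_exprs_table.xls.gz" for gsm, tumor in SAMPLES]


def extract_selected(tar_path, dest_dir, wanted=WANTED):
    """Copy only the files named in `wanted` from a tar archive into dest_dir.

    Returns the sorted list of extracted file names and raises
    FileNotFoundError if any requested file is absent from the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    remaining = set(wanted)
    extracted = []
    with tarfile.open(tar_path) as tar:
        for member in tar:
            # Compare on the base name so that archive subfolders do not matter.
            name = Path(member.name).name
            if member.isfile() and name in remaining:
                with tar.extractfile(member) as src, open(dest / name, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                remaining.discard(name)
                extracted.append(name)
    if remaining:
        raise FileNotFoundError(f"not found in archive: {sorted(remaining)}")
    return sorted(extracted)
```

Assuming the tarball was saved to data_raw/GSE137804, a call such as extract_selected("data_raw/GSE137804/GSE137804_RAW.tar", "data_raw/GSE137804") would place the eleven files next to the annotation file.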

Optionally, obtain intermediary data: Extract the contents of R_data_generated.tgz from the Zenodo repository https://doi.org/10.5281/zenodo.7707614 into the folder data_generated.
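
Before running any workflow, it may help to confirm that data_raw contains the subfolders named in this section. The following is a minimal sketch using only the Python standard library; the helper missing_subfolders is hypothetical and not part of the repository, and it only checks that each folder exists and is not empty, not that its contents are complete.

```python
from pathlib import Path

# Subfolder names as listed in the "Download data" section.
EXPECTED_SUBFOLDERS = ["adrmed", "rna_seq", "atac_seq", "GSE137804", "snp_array"]


def missing_subfolders(root="data_raw", expected=EXPECTED_SUBFOLDERS):
    """Return the expected subfolders that are absent or empty under root."""
    root = Path(root)
    missing = []
    for name in expected:
        folder = root / name
        # A folder counts as missing if it does not exist or holds no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing
```

Calling missing_subfolders() from the repository root returns an empty list once every subfolder has been populated.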

scRNA-seq analysis

Main workflow

Run these R scripts in the given order to generate all files required by figures and tables.

Plotting functions

Run these R scripts in arbitrary order to generate publication figures and tables:

Other scripts

scATAC-seq analysis

All required scripts are in subfolder scatac.

scATAC-seq workflow

scATAC-seq and scRNA-seq integration workflow

For data integration, we used scGLUE (Graph-Linked Unified Embedding), a model for integrating unpaired single-cell multi-omics data (https://scglue.readthedocs.io/en/latest/), and followed the detailed tutorial at https://scglue.readthedocs.io/en/latest/tutorials.html. Before starting the tutorial, we converted the scRNA-seq and scATAC-seq objects from SingleCellExperiment and Seurat format, respectively, to AnnData format. Many tools exist for this conversion; we share our own approach in monocle_to_anndata.R and Seurat_to_anndata.R.
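
One detail that tends to matter after such a conversion is that the scGLUE preprocessing tutorial works with genomic coordinates (chromosome, start, end) for each ATAC peak. The sketch below is not taken from this repository. It shows one way to parse peak names into coordinates with the Python standard library, assuming peaks are named either "chr1:100-200" or "chr1-100-200"; the function name parse_peak is hypothetical, and the actual peak naming produced by the conversion scripts may differ.

```python
import re

# Assumed peak name layouts: "chrom:start-end" or "chrom-start-end".
# The chromosome part is matched greedily so that contig names containing
# "-" or "." are still handled.
PEAK_PATTERN = re.compile(r"^(.+)[:-](\d+)-(\d+)$")


def parse_peak(name):
    """Split a peak name into (chrom, start, end) with integer coordinates."""
    match = PEAK_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized peak name: {name!r}")
    chrom, start, end = match.groups()
    return chrom, int(start), int(end)
```

The resulting tuples could then be stored as per-feature annotation columns of the ATAC AnnData object before following the tutorial.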

The following Jupyter notebooks mirror those of the scGLUE integration pipeline.

Finally, Figures.R generates publication figures.

License: MIT License