BGI-Qingdao / scMultimodal

Multimodal integration of various types of sequencing data

scMultimodal

Overview

To run scmulti:

python run.py -o results_dir \
        --rna scrna.anno.h5ad \
        --rds scrna.anno.rds \
        --atac atac.anno.h5ad \
        --name projectA \
        --fragment fragments.tsv.gz \
        --meta Metadata.tsv \
        --gtf species_gene_annotation.gtf \
        --fasta reference_genome.fa \
        --chromsize sizes.genome \
        --upstream 0 1500 \
        --downstream 100 15000
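
The README does not describe the format expected by --chromsize. Assuming it is the common two-column, tab-separated table of sequence name and length (the layout used by bedtools "genome" files), a minimal sketch for deriving such a file from the reference FASTA could look like the following. The function names fasta_chrom_sizes and write_chrom_sizes are hypothetical helpers and are not part of this repository.

```python
def fasta_chrom_sizes(fasta_path):
    """Return (name, length) pairs for each sequence in an uncompressed FASTA."""
    sizes = []
    name, length = None, 0
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if name is not None:
                    sizes.append((name, length))
                header = line[1:].split()
                # Keep only the first token of the header as the sequence name.
                name, length = (header[0] if header else ""), 0
            elif name is not None:
                length += len(line)
    if name is not None:
        sizes.append((name, length))
    return sizes


def write_chrom_sizes(fasta_path, out_path):
    """Write name<TAB>length lines (assumed layout of sizes.genome)."""
    with open(out_path, "w") as out:
        for name, length in fasta_chrom_sizes(fasta_path):
            out.write(f"{name}\t{length}\n")


# Example call (file names taken from the command above):
# write_chrom_sizes("reference_genome.fa", "sizes.genome")
```

If run.py expects a different layout for this file, only write_chrom_sizes needs to change.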

To make a cisTarget database for species other than human or mouse:

python custom_databases.py --prefix speciesX \
    --fasta reference.fasta \
    --output /path/to/save/ \
    --consensus_regions consensus_regions.bed \
    --ortholog human_speciesX_ortholog.tsv \
    --ref_species hg \
    --atac atac.h5ad \
    --bedpath /path/to/bedtools

To run label transfer:

Rscript label_transfer.R --rna scrna.anno.rds \
    --atac integrated.atac.rds \
    -m Metadata.tsv \
    --fragments fragments.tsv.gz \
    --gff species_gene_annotation.gtf \
    --name projectA \
    -o results_dir/

To integrate multiple ATAC datasets (from different libraries):

Rscript integrate.R -n name --list datasets_ids.txt --data_path /path/to/samples/

Example of a datasets_ids.txt:

sample_id_1
sample_id_2
sample_id_3
...

Example of a data path:

 |--/path/to/samples/
    |--sample_id_1
       |--Peak_matrix
          |--barcodes.tsv
          |--matrix.mtx
          |--peak.bed
    |--sample_id_2
       |--Peak_matrix
          |--barcodes.tsv
          |--matrix.mtx
          |--peak.bed
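
Before calling integrate.R, it may help to confirm that every ID in datasets_ids.txt has the files shown in the layout above. The following is a minimal sketch of such a check. It assumes one sample ID per line and exactly the three file names from the example. The helpers read_sample_ids and find_missing_files are hypothetical and are not part of this repository.

```python
from pathlib import Path

# File names taken from the example layout above.
REQUIRED_FILES = ("barcodes.tsv", "matrix.mtx", "peak.bed")


def read_sample_ids(list_file):
    """Return the non-empty lines of the ID list, one sample ID per line."""
    with open(list_file) as handle:
        return [line.strip() for line in handle if line.strip()]


def find_missing_files(data_path, sample_ids):
    """Map each sample ID to the expected Peak_matrix files that are absent."""
    missing = {}
    for sample_id in sample_ids:
        peak_dir = Path(data_path) / sample_id / "Peak_matrix"
        absent = [name for name in REQUIRED_FILES
                  if not (peak_dir / name).is_file()]
        if absent:
            missing[sample_id] = absent
    return missing


# Example call (paths taken from the command above):
# ids = read_sample_ids("datasets_ids.txt")
# print(find_missing_files("/path/to/samples/", ids))
```

An empty dictionary means every listed sample has all three files. Samples that appear in the result are missing at least one of them.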
