alexdobin / STAR

RNA-seq aligner

Why do I get such small output files from STARsolo?

gmnmnm opened this issue

I ran STARsolo on Google Colab Pro, using the code below, on a scRNA-seq sample from a human kidney transplant rejection biopsy (10x Chromium 5' v3.1).
The output was a 700 MB BAM file plus barcodes.tsv, features.tsv, and matrix.mtx, each only 1-5 KB. The features file was empty except for "missing features".

I was planning to find DEGs by cell type with Seurat afterwards, but this output isn't usable. Did I make a mistake when running STARsolo, or is there some way to use the BAM file for further single-cell analysis?
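
A quick way to see how little the count matrix actually contains is to read the size line of matrix.mtx. The sketch below assumes the file is in Matrix Market coordinate format, where lines starting with `%` are header or comments and the first other line holds the row count, column count, and number of non-zero entries. The function name `mtx_dimensions` is hypothetical.

```python
from pathlib import Path


def mtx_dimensions(path):
    """Return (rows, columns, nonzero_entries) from a Matrix Market file.

    Assumes coordinate format: lines starting with '%' are header or
    comment lines, and the first other non-blank line is the size line.
    """
    with Path(path).open() as handle:
        for line in handle:
            if line.startswith("%") or not line.strip():
                continue
            rows, cols, entries = (int(field) for field in line.split()[:3])
            return rows, cols, entries
    raise ValueError(f"no size line found in {path}")
```

In a STARsolo matrix the rows are typically features and the columns barcodes, so a row count of zero or one would match the nearly empty features.tsv described above. Point the function at wherever matrix.mtx was written in your run.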

!wget https://github.com/alexdobin/STAR/archive/2.7.11b.tar.gz
!tar -xzf 2.7.11b.tar.gz
!cd STAR-2.7.11b

%%bash
cd /content/STAR-2.7.11b/source
make STAR

%%bash
sudo apt-get update
sudo apt-get install g++
sudo apt-get install make

!wget https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh38_latest/refseq_identifiers/GRCh38_latest_genomic.fna.gz
!wget https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh38_latest/refseq_identifiers/GRCh38_latest_genomic.gff.gz

%%bash
sudo apt install unzip

%%bash
gzip -d /content/GRCh38_latest_genomic.fna.gz
gzip -d /content/GRCh38_latest_genomic.gff.gz

%%bash
/content/STAR-2.7.11b/source/STAR \
  --runThreadN 12 \
  --runMode genomeGenerate \
  --genomeDir /content/STAR-2.7.11b/genome \
  --genomeFastaFiles /content/GRCh38_latest_genomic.fna \
  --sjdbGTFfile /content/GRCh38_latest_genomic.gff \
  --sjdbOverhang 100
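
The `--sjdbGTFfile` option above is given a GFF file. Whether that explains the empty features file is not confirmed in this thread, but it may be worth checking which attribute style the annotation uses: GTF writes attributes as `gene_id "X";`, while GFF3 writes `ID=X;Parent=Y`. The following is a minimal sketch of such a check. The function name `attribute_style` is hypothetical, and the classification is a simple heuristic, not STAR's own parser.

```python
import re


def attribute_style(annotation_line):
    """Guess whether one annotation line uses GTF or GFF3 attributes.

    Heuristic only: inspects the ninth tab-separated column.
    Returns "gtf", "gff3", or "unknown".
    """
    if annotation_line.startswith("#"):
        return "unknown"
    fields = annotation_line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return "unknown"
    attributes = fields[8]
    # GTF style: key "value"; pairs, e.g. gene_id "G1"; transcript_id "T1";
    if re.search(r'\b\w+ "[^"]*"', attributes):
        return "gtf"
    # GFF3 style: key=value pairs, e.g. ID=exon-1;Parent=rna-1
    if re.search(r"\b\w+=[^;]+", attributes):
        return "gff3"
    return "unknown"
```

If the exon lines come back as "gff3", the STAR manual describes `--sjdbGTFtagExonParentTranscript Parent` for GFF3 input. Whether that alone gives STARsolo usable gene-level features is not established here, so converting the annotation to GTF may be the safer route.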

!wget "ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR100/ERR10040318/sc5rEXT217_hg19_S1_L001_R2_001.fastq.gz"
!wget "ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR100/ERR10040318/sc5rEXT217_hg19_S1_L001_R1_001.fastq.gz"

%%bash
curl -o cellranger-8.0.0.tar.gz "https://cf.10xgenomics.com/releases/cell-exp/cellranger-8.0.0.tar.gz?Expires=1713989254&Key-Pair-Id=APKAI7S6A5RYOXBWRPDA&Signature=ITjFA8SQmAmUz1yrPO4NPWwfF-fCEI72f8uyf2UCJpxk~dsQsVZ9YXcc42aIIDY9jNo4LgAHMwuejpS7ZNOX0581sHfSHR4zEanVL1L38DtzCkOVd~F83VIZyrZm-qh7toMS4Fe9GbA4YWbmVVX9sRkHhzuWZuXKNrCyGFbhwsaPYDc9reWKu9dZ4HRbodoGSd9BTilOR13SMbwjgRHdJNJsvfHgCV2Px76bW8LP~wcpEsac51mdCOsonGGc-cdRg1dcs91bQjANIA-32eBxOArH4~-l33Cbx7RqG-nCvsgbFSgYOATRzQJeDjRi5-doHAZLQ-0B-4e9AeMOMYxD8A__"

%%bash
tar -zxvf /content/cellranger-8.0.0.tar.gz

%%bash
gzip -d /content/cellranger-8.0.0/lib/python/cellranger/barcodes/3M-5pgex-jan-2023.txt.gz

%%bash
/content/STAR-2.7.11b/source/STAR \
  --runThreadN 4 \
  --genomeDir /content/STAR-2.7.11b/genome \
  --readFilesIn /content/sc5rEXT217_hg19_S1_L001_R2_001.fastq.gz /content/sc5rEXT217_hg19_S1_L001_R1_001.fastq.gz \
  --readFilesCommand zcat \
  --soloType CB_UMI_Simple \
  --soloCBwhitelist /content/cellranger-8.0.0/lib/python/cellranger/barcodes/3M-5pgex-jan-2023.txt \
  --soloUMIlen 12 \
  --outSAMtype BAM SortedByCoordinate
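
The command above sets `--soloUMIlen 12` but leaves the other barcode layout options at their defaults. As a sanity check, assuming the barcode read has to be long enough to hold the cell barcode plus the UMI, the sketch below reports the read lengths found in the first few records of a gzipped FASTQ. The function name `first_read_lengths` and the default record count are hypothetical choices, and the code assumes plain four-line FASTQ records.

```python
import gzip
from itertools import islice


def first_read_lengths(fastq_gz_path, n_reads=1000):
    """Return the set of sequence lengths in the first n_reads records.

    Assumes four-line FASTQ records, with the sequence on the second line.
    """
    lengths = set()
    with gzip.open(fastq_gz_path, "rt") as handle:
        for index, line in enumerate(islice(handle, 4 * n_reads)):
            if index % 4 == 1:
                lengths.add(len(line.rstrip("\n")))
    return lengths
```

With a 16-base cell barcode (an assumption, since the command does not set `--soloCBlen`) and a 12-base UMI, the barcode read passed second to `--readFilesIn` would need at least 28 bases.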

I have a similar question. Have you solved it?