Why do I get such small output files from STARsolo?
gmnmnm opened this issue
I ran STARsolo with the code below on Google Colab Pro, on a 10x Chromium 5' v3.1 scRNA-seq sample from a human kidney transplant rejection biopsy.
The output was a 700 MB BAM file, plus barcodes.tsv, features.tsv, and matrix.mtx at only 1 to 5 KB each. The features file was empty except for "missing features".
I was planning to find DEGs by cell type with Seurat afterwards, but this output isn't usable. Did I make a mistake when running STARsolo, or is there some way to use the BAM file for further single-cell analysis?
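To put a number on "only 1 to 5 KB", the size line of matrix.mtx can be read directly. This is a minimal sketch, assuming the file is in standard Matrix Market coordinate format (header and comment lines starting with `%`, followed by one line giving rows, columns, and non-zero entries). `mtx_dims` is a hypothetical helper name, and the path in the usage comment is only where STARsolo output would be expected with default output settings.

```shell
# Print "rows cols entries" from a Matrix Market file.
# Header/comment lines start with '%'; the first other line holds the sizes.
# In the 10x-style layout, rows are features and columns are barcodes.
mtx_dims() {
  awk '!/^%/ { print $1, $2, $3; exit }' "$1"
}

# Usage (path is an assumption, adjust to the actual output directory):
#   mtx_dims Solo.out/Gene/raw/matrix.mtx
```

A row count of 1 or an entry count of 0 would confirm that almost nothing was counted, rather than the files merely being compact.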
!wget https://github.com/alexdobin/STAR/archive/2.7.11b.tar.gz
!tar -xzf 2.7.11b.tar.gz
!cd STAR-2.7.11b
%%bash
cd /content/STAR-2.7.11b/source
make STAR
%%bash
sudo apt-get update
sudo apt-get install g++
sudo apt-get install make
!wget https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh38_latest/refseq_identifiers/GRCh38_latest_genomic.fna.gz
!wget https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh38_latest/refseq_identifiers/GRCh38_latest_genomic.gff.gz
%%bash
sudo apt install unzip
%%bash
gzip -d /content/GRCh38_latest_genomic.fna.gz
gzip -d /content/GRCh38_latest_genomic.gff.gz
%%bash
/content/STAR-2.7.11b/source/STAR \
  --runThreadN 12 \
  --runMode genomeGenerate \
  --genomeDir /content/STAR-2.7.11b/genome \
  --genomeFastaFiles /content/GRCh38_latest_genomic.fna \
  --sjdbGTFfile /content/GRCh38_latest_genomic.gff \
  --sjdbOverhang 100
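The annotation passed to --sjdbGTFfile here is a GFF3 file, while the STAR manual describes GTF as the expected format and mentions `--sjdbGTFtagExonParentTranscript Parent` for GFF3 input. Whether this explains the nearly empty features file is only a guess, but it can be checked: by default STAR looks for `exon` lines carrying a GTF-style `gene_id` attribute. The sketch below counts such lines; `count_gtf_gene_ids` is a hypothetical helper name, not part of STAR.

```shell
# Count exon records whose attribute column (9th) has a GTF-style gene_id "..." tag.
# Lines starting with '#' are treated as comments and skipped.
count_gtf_gene_ids() {
  awk -F'\t' '!/^#/ && $3 == "exon" && $9 ~ /gene_id "[^"]+"/ { n++ } END { print n + 0 }' "$1"
}

# Usage:
#   count_gtf_gene_ids /content/GRCh38_latest_genomic.gff
```

A result of 0 would suggest that the file has no GTF-style gene IDs, in which case converting it to GTF (or downloading a GTF release of the same annotation) and regenerating the genome index may be worth trying.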
!wget "ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR100/ERR10040318/sc5rEXT217_hg19_S1_L001_R2_001.fastq.gz"
!wget "ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR100/ERR10040318/sc5rEXT217_hg19_S1_L001_R1_001.fastq.gz"
%%bash
tar -zxvf /content/cellranger-8.0.0.tar.gz
%%bash
gzip -d /content/cellranger-8.0.0/lib/python/cellranger/barcodes/3M-5pgex-jan-2023.txt.gz
%%bash
/content/STAR-2.7.11b/source/STAR \
  --runThreadN 4 \
  --genomeDir /content/STAR-2.7.11b/genome \
  --readFilesIn /content/sc5rEXT217_hg19_S1_L001_R2_001.fastq.gz /content/sc5rEXT217_hg19_S1_L001_R1_001.fastq.gz \
  --readFilesCommand zcat \
  --soloType CB_UMI_Simple \
  --soloCBwhitelist /content/cellranger-8.0.0/lib/python/cellranger/barcodes/3M-5pgex-jan-2023.txt \
  --soloUMIlen 12 \
  --outSAMtype BAM SortedByCoordinate
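The command sets only --soloUMIlen, so the cell barcode length stays at STAR's documented default of 16 and the barcode read (R1 here, passed second in --readFilesIn) needs to hold at least 16 + 12 = 28 bases. A quick sanity check, sketched under that assumption, is to print the length of the first read in that file. `first_read_len` is a hypothetical helper name.

```shell
# Print the length of the first sequence in a FASTQ stream read from stdin.
# In FASTQ, line 2 of each 4-line record is the sequence.
first_read_len() {
  awk 'NR == 2 { print length($0); exit }'
}

# Usage:
#   zcat /content/sc5rEXT217_hg19_S1_L001_R1_001.fastq.gz | first_read_len
```

If the reported length is shorter than 28, or the library's actual barcode and UMI layout differs from 16 + 12, the barcode and UMI options would need to be adjusted to match the chemistry.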
I have a similar question. Have you solved it yet?