chi-0828 / RNA-Abundance-Quantification-on-UPMEM

Running state-of-the-art RNA-seq abundance quantification software "kallisto" on UPMEM DPU system

RNA-Abundance-Quantification-on-UPMEM

  • More details can be found in the paper: "RNA-seq Quantification on Processing in Memory Architecture: Observation and Characterization"
  • paper

Our newer project on RNA-seq quantification with UPMEM DPUs

UpPipe

Cite the paper if you use D_kallisto in your work

Liang-Chi Chen, Shu-Qi Yu, Chien-Chung Ho, Yuan-Hao Chang, Da-Wei Chang, Wei-Chen Wang, Yu-Ming Chang, "RNA-seq Quantification on Processing in memory Architecture: Observation and Characterization," The 11th IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA), August 23-25, 2022

@inproceedings{chen2022rna,
  title={RNA-seq Quantification on Processing in memory Architecture: Observation and Characterization},
  author={Chen, Liang-Chi and Yu, Shu-Qi and Ho, Chien-Chung and Chang, Yuan-Hao and Chang, Da-Wei and Wang, Wei-Chen and Chang, Yu-Ming},
  booktitle={2022 IEEE 11th Non-Volatile Memory Systems and Applications Symposium (NVMSA)},
  pages={26--32},
  year={2022},
  organization={IEEE}
}

Build

# build htslib first
cd ext/htslib
autoheader
autoconf
make -j16
# build our main program
cd ../..
mkdir -p obj
cd src
make -j16
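
The build steps above assume that autoheader, autoconf, and make are available on PATH. As a minimal sketch (the helper name `missing_tools` is hypothetical and not part of this repository), the following function lists any required commands that cannot be found, so a missing prerequisite shows up before the build starts:

```shell
#!/bin/sh
# Hypothetical helper, not part of this repository:
# print every command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}
```

For example, `missing_tools autoheader autoconf make dpu-pkg-config` prints nothing when everything is installed. Including `dpu-pkg-config` (the UPMEM SDK helper) in that list is an assumption about what the Makefile in src needs; adjust the list to match your setup.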

Usage

./D_kallisto pseudo [fastq file] 
      -i [index file] 
      -o [output path] 
      -t [num of CPU threads] 
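      -d [num of DPUs]
      --single
      -l [double]   (estimated average fragment length, as in kallisto)
      -s [double]   (estimated standard deviation of fragment length, as in kallisto)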

E.g., testing 100K reads with an 11-mer index on 64*8 DPUs

time ./D_kallisto pseudo -i ~/data/experiment/11-mer.idx -o out --single ~/data/experiment/RNA_read/100K.fastq -l 150 -s 30 -t 8 -d 64
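
When the same run is repeated with different thread or DPU counts, it can help to generate the command line from its parameters. The function below is a sketch only: the name `dk_cmd` is hypothetical, it uses just the flags listed under Usage, and it prints the command instead of running it so the result can be inspected first.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of this repository:
# print the D_kallisto command line for a single-end run.
# Arguments: index file, output dir, FASTQ file, value for -l,
# value for -s, number of CPU threads, value for -d.
dk_cmd() {
    if [ "$#" -ne 7 ]; then
        echo "usage: dk_cmd INDEX OUTDIR FASTQ FRAG_LEN FRAG_SD THREADS DPUS" >&2
        return 1
    fi
    printf './D_kallisto pseudo -i %s -o %s --single %s -l %s -s %s -t %s -d %s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}
```

Calling `dk_cmd ~/data/experiment/11-mer.idx out ~/data/experiment/RNA_read/100K.fastq 150 30 8 64` reproduces the example above. The sketch does not quote its arguments, so it assumes paths without spaces.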

More information

The DPU program is in src/dpu_app.
DPU allocation and the CPU-to-DPU and DPU-to-CPU transfers are in src/ProcessReads.cpp.

Reference

kallisto

https://github.com/pachterlab/kallisto

UPMEM

https://github.com/CMU-SAFARI/prim-benchmarks
https://sdk.upmem.com/2021.3.0/
https://sdk.upmem.com/2021.3.0/CppAPI/index.html

Testing CPU-based kallisto

time ./kallisto pseudo -i ~/data/experiment/11-mer.idx -o out --single ~/data/experiment/RNA_read/100K.fastq -l 150 -s 30 -t 8 
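
To check a DPU run against the CPU baseline, one option is to write the two runs to different output directories (both example commands use `-o out`, so one of them has to change) and compare the files. The helper below is a sketch under that assumption; the name `diff_outputs` is hypothetical, and it makes no assumption about which files either program writes.

```shell
#!/bin/sh
# Hypothetical helper, not part of this repository:
# compare every regular file in directory $1 with the file of the same
# name in directory $2 and print the names that are missing or differ.
# Returns 0 only when nothing is reported.
diff_outputs() {
    status=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        if [ ! -f "$2/$name" ]; then
            echo "missing: $name"
            status=1
        elif ! cmp -s "$f" "$2/$name"; then
            echo "differs: $name"
            status=1
        fi
    done
    return "$status"
}
```

For example, `diff_outputs out_cpu out_dpu` with the directory names chosen at run time. The comparison is byte-for-byte, so any metadata file that records the command line or a timestamp can be expected to differ even when the results agree.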


License: MIT License

