
UpPipe is an RNA abundance quantification design for a real processing-near-memory system (UPMEM DPU). The accompanying paper was published at the Design Automation Conference (DAC) 2023.

Home Page: https://doi.org/10.1109/DAC56929.2023.10247915


UpPipe


Citation

Liang-Chi Chen, Chien-Chung Ho, and Yuan-Hao Chang, “UpPipe: A Novel Pipeline Management on In-Memory Processors for RNA-seq Quantification,” ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, July 9-13, 2023.

@inproceedings{chen2023uppipe,
  title={UpPipe: A Novel Pipeline Management on In-Memory Processors for RNA-seq Quantification},
  author={Chen, Liang-Chi and Ho, Chien-Chung and Chang, Yuan-Hao},
  booktitle={2023 60th ACM/IEEE Design Automation Conference (DAC)},
  pages={1--6},
  year={2023},
  organization={IEEE}
}

Materials

Hardware/System Prerequisites

The project must be run on a system equipped with UPMEM DRAM Processing Units (DPUs), and the UPMEM SDK must be installed on that system.

Start

git clone https://github.com/chi-0828/UpPipe.git
cd UpPipe
chmod +x build.sh
./build.sh
make -j4

Usage

Allocate transcriptome to DPU(s)

  • KMER SIZE should be an odd number from 3 to 31 (3, 5, ..., 31)
  • We suggest keeping NUMBER OF DPU(s) in a PIPELINE WORKER below 64
./UpPipe build \
            -k KMER SIZE  \
            -i OUTPUT INDEX FILE PATH \
            -d NUMBER OF DPU(s) in a PIPELINE WORKER \
            -f TRANSCRIPTOME FILE PATH
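
The two constraints listed for the build step can be checked before `./UpPipe build` is invoked. The following is a minimal sketch in C, not code from the UpPipe sources: the helper names `kmer_size_ok` and `dpus_per_worker_ok` are hypothetical, and the lower bound of one DPU is an assumption added for the illustration.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of UpPipe): accepts only the k-mer sizes
 * listed in this README, i.e. odd values in the range 3..31. */
bool kmer_size_ok(int k)
{
    return k >= 3 && k <= 31 && (k % 2) == 1;
}

/* Hypothetical helper (not part of UpPipe): follows the README's suggestion
 * of fewer than 64 DPUs per pipeline worker. Requiring at least one DPU is
 * an assumption of this sketch. */
bool dpus_per_worker_ok(int dpus)
{
    return dpus >= 1 && dpus < 64;
}
```

With these helpers, the values used in the Test section (`-k 11`, `-d 60`) pass both checks, while an even k-mer size such as 12 or a worker with 64 DPUs is rejected.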

Run alignment step for quantification

  • The k-mer size is already fixed in INPUT INDEX FILE and cannot be changed in this step
./UpPipe alignment \
            -i INPUT INDEX FILE PATH \
            -r NUMBER OF PIPELINE WORKER(s) \
            -f INPUT RNA READ FILE PATH

Parameters setting (dpu_app/dpu_def.h)

  • A KMER SIZE less than 7 may lead to inaccurate mapping results
  • NUMBER OF DPU(s) in a PIPELINE WORKER should be less than 64 for optimal performance
  • The number of transcripts divided by NUMBER OF DPU(s) in a PIPELINE WORKER must be less than 200 (COUNT_LEN in dpu_app/dpu_def.h)
  • Set READ_LEN to the sequence length of the RNA reads
  • Set WRAM_READ_LEN to a number that is larger than READ_LEN and divisible by 8
  • WRAM_PREFETCH_SIZE is the size used for WRAM prefetching; 16 is the optimal size in most situations
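
The arithmetic behind the WRAM_READ_LEN and COUNT_LEN rules above can be sketched as follows. This is an illustrative sketch only and does not reproduce dpu_app/dpu_def.h: the function names are hypothetical, "larger than" is read as strictly larger, and the transcripts are assumed to be split evenly across the DPUs of a pipeline worker, so the worst-case load is the rounded-up quotient.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: smallest multiple of 8 that is strictly larger than
 * read_len, i.e. the smallest value satisfying the WRAM_READ_LEN rule. */
size_t wram_read_len_for(size_t read_len)
{
    return (read_len / 8 + 1) * 8;
}

/* Hypothetical helper: checks that the per-DPU transcript count stays below
 * count_len (200 according to this README). Assumes an even split, so the
 * busiest DPU holds ceil(n_transcripts / dpus_per_worker) transcripts. */
bool transcripts_fit(size_t n_transcripts, size_t dpus_per_worker,
                     size_t count_len)
{
    if (dpus_per_worker == 0) {
        return false; /* no DPU to hold the transcripts */
    }
    size_t per_dpu = (n_transcripts + dpus_per_worker - 1) / dpus_per_worker;
    return per_dpu < count_len;
}
```

For example, a READ_LEN of 100 gives a WRAM_READ_LEN of 104, and a READ_LEN of 96 (already a multiple of 8) also gives 104 under the strict reading. With 60 DPUs per pipeline worker and a limit of 200, 11000 transcripts fit (184 per DPU) but 12000 do not (200 per DPU).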

Test

  • Build the index file with 11-mers and allocate it to 60 DPUs
./UpPipe build \
            -k 11  \
            -i test/test.idx \
            -d 60 \
            -f test/tran.fa
  • Run alignment with 40 pipeline workers
./UpPipe alignment \
            -i test/test.idx \
            -r 40 \
            -f test/read.fa
  • Performance: UpPipe with 40 pipeline workers
real    0m2.747s
  • Performance: UpPipe with 20 pipeline workers
real    0m3.584s
real    0m4.003s
  • Note that UpPipe shows its efficiency more clearly on large datasets, owing to its processing-in-memory features
