38 / d4-format

The D4 Quantitative Data Format

Auto-detect sparse (k=0) format.

arq5x opened this issue

Currently, for sparse signal such as RNA-seq or ChIP-seq, one must pass -S to d4utils create when converting from a bigWig file. The sparse format should instead be auto-detected during creation.

For example, consider this ChIP-seq bigWig file from ENCODE:

wget https://www.encodeproject.org/files/ENCFF405ZDL/@@download/ENCFF405ZDL.bigWig
mv ENCFF405ZDL.bigWig ENCFF405ZDL.bw
ls -lh ENCFF405ZDL.bw
1.0G Dec 10  2020 ENCFF405ZDL.bw

d4utils create ENCFF405ZDL.bw ENCFF405ZDL.bigWig.d4
ls -lh ENCFF405ZDL.bigWig.d4
2.2G Aug 18 05:59 ENCFF405ZDL.bigWig.d4

However, without -S the D4 file is 2.2 GB (versus 1.0 GB for the original bigWig), whereas with -S the result is only 58 MB, almost 40 times smaller:

d4utils create -S ENCFF405ZDL.bw ENCFF405ZDL.bigWig.d4
ls -lh ENCFF405ZDL.bigWig.d4
58M Aug 18 06:05 ENCFF405ZDL.bigWig.d4
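
Until auto-detection exists, one possible workaround is to build the file both ways and keep whichever is smaller. The sketch below is only an illustration of that idea: smaller_file and create_smallest_d4 are hypothetical helper names, not part of d4utils, and the temporary file suffixes are arbitrary. It relies only on the two d4utils create invocations shown above.

```shell
#!/bin/sh

# Hypothetical helper (not part of d4utils): print the path of the smaller
# of two files, comparing sizes in bytes. Ties go to the first argument.
smaller_file() {
    # $(( ... )) normalizes any whitespace padding that wc may emit.
    size_a=$(( $(wc -c < "$1") ))
    size_b=$(( $(wc -c < "$2") ))
    if [ "$size_a" -le "$size_b" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
}

# Hypothetical wrapper: create the D4 file without and with -S,
# then keep whichever output is smaller under the requested name.
create_smallest_d4() {
    input=$1
    output=$2
    d4utils create "$input" "$output.dense.tmp" || return 1
    d4utils create -S "$input" "$output.sparse.tmp" || return 1
    best=$(smaller_file "$output.dense.tmp" "$output.sparse.tmp")
    mv "$best" "$output" || return 1
    # Remove the leftover candidate; the chosen one has already been moved.
    rm -f "$output.dense.tmp" "$output.sparse.tmp"
}
```

With these functions sourced, the example above would become create_smallest_d4 ENCFF405ZDL.bw ENCFF405ZDL.bigWig.d4. This doubles the conversion time and temporarily needs disk space for both outputs, so it is a stopgap rather than a substitute for detecting sparsity during creation.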