bschilder / ThreeWayTest

Summary statistics-based association test for identifying the pleiotropic effects with set of genetic variants

Optimise data compression

bschilder opened this issue

https://github.com/bschilder/ThreeWayTest/actions/runs/4313767916/jobs/7526122915#step:4:5962

checking LazyData ... WARNING
    LazyData DB of 7.8 MB without LazyDataCompression set
    See §1.1.6 of 'Writing R Extensions'
checking data for ASCII and uncompressed saves ... WARNING
    
    Note: significantly better compression could be obtained
          by using R CMD build --resave-data
                               old_size new_size compress
    covariance_matrix_data.rda    1.3Mb    896Kb       xz
    data_matrix_final.rda         6.2Mb    3.9Mb       xz
    gene_length_list.rda           16Kb     11Kb       xz
    gene_list.rda                  73Kb     54Kb    bzip2
    selected_genotype.rda          14Kb      5Kb       xz

The latter warning is fixed by running the following (solution found here):

# Re-save every file in data/ with the compression R selects for it
f <- list.files("data", full.names = TRUE)
tools::resaveRdaFiles(f)
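
To confirm that the re-save actually shrank the files, the sizes can be compared before and after. The following is a minimal sketch, not part of the original fix: the function name `resave_and_report` and its `data_dir` argument are hypothetical, and it assumes the directory contains only ordinary `.rda` files saved with `save()`. It relies on `tools::checkRdaFiles()`, which reports the size and compression type of each file.

```r
# Hypothetical helper: re-save all .rda files in a directory and
# return a table similar to the one printed by R CMD check.
resave_and_report <- function(data_dir = "data") {
  rda_files <- list.files(data_dir, pattern = "\\.rda$", full.names = TRUE)

  # Record size (in bytes) and compression type before re-saving
  before <- tools::checkRdaFiles(rda_files)

  # compress = "auto" lets R choose the compression for each file
  tools::resaveRdaFiles(rda_files, compress = "auto")

  # Record the same information after re-saving
  after <- tools::checkRdaFiles(rda_files)

  data.frame(
    file = basename(rda_files),
    old_size = before$size,
    new_size = after$size,
    compress = after$compress,
    stringsAsFactors = FALSE
  )
}
```

Calling `resave_and_report("data")` from the package root would overwrite the files in place, so it is worth running on a clean git working tree so the change can be reviewed or reverted.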

https://github.com/bschilder/ThreeWayTest/actions/runs/4313767916/jobs/7526122915#step:4:5962

checking LazyData ... WARNING
    LazyData DB of 7.8 MB without LazyDataCompression set
    See §1.1.6 of 'Writing R Extensions'

This is resolved by setting the LazyDataCompression field in the DESCRIPTION file.
Solution found here.

I chose gzip, though it's possible other formats are more efficient:

LazyDataCompression: gzip
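
As an untested alternative, the field also accepts bzip2 and xz. Since the check output above picked xz for most of the individual .rda files, xz may also give a smaller lazy-load database here, but that is an assumption and was not tried in this issue:

```
LazyDataCompression: xz
```

The gzip setting shown before this is the one that was actually committed.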

With LazyDataCompression set to gzip, ThreeWayTest now passes checks with only a note about the size (which is ok for now):
Screenshot 2023-03-02 at 13 16 33