mdshw5 / fastqp

Simple FASTQ quality assessment using Python

Home Page: https://pypi.python.org/pypi/fastqp

Make pivoted text table by default

mdshw5 opened this issue

@mdjones prefers tab-separated values.

  • store the aggregated metrics in a dictionary
  • build a pandas DataFrame from the dictionary
  • pivot the DataFrame on the factors in the second column
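
A rough sketch of the three steps above, ending with tab-separated output. This is not fastqp's actual code: the `metrics` dictionary layout, the `sample`/`metric`/`value` column names, and the sample values are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical aggregated metrics, keyed by (sample, metric name).
# fastqp's real internal structure may differ.
metrics = {
    ("UHR1_1_1", "reads"): 103870017,
    ("UHR1_1_1", "read_gc_91"): 40,
    ("UHR1_1_2", "reads"): 103870017,
    ("UHR1_1_2", "read_gc_91"): 52,
}

# Step 1 -> 2: build a "melted" (long) DataFrame, one row per measurement.
melted = pd.DataFrame(
    [(sample, metric, value) for (sample, metric), value in metrics.items()],
    columns=["sample", "metric", "value"],
)

# Step 3: pivot on the second column so each metric becomes its own column.
pivoted = melted.pivot(index="sample", columns="metric", values="value")

# Write the pivoted table as tab-separated text.
buffer = io.StringIO()
pivoted.to_csv(buffer, sep="\t")
tsv_text = buffer.getvalue()
print(tsv_text)
```

Writing to a `StringIO` buffer here just keeps the example self-contained; passing a file path to `to_csv` works the same way.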

@mdjones, pivoted tables are only slightly more readable (maybe less) than the "melted" table:

>>> df.pivot(index='index', columns='columns', values='values')
columns   100_q05  100_q25  100_q50  100_q75  100_q95  10_q05  10_q25  10_q50  \
index                                                                           
UHR1_1_1        2       30       34       35       35      32      37      39   
UHR1_1_2        2       25       33       35       35      30      37      39   

columns   10_q75  10_q95    ...      read_gc_91  read_gc_92  read_gc_93  \
index                       ...                                           
UHR1_1_1      39      39    ...              40          23          20   
UHR1_1_2      39      39    ...              52          44          29   

columns   read_gc_94  read_gc_95  read_gc_96  read_gc_97  read_gc_98  \
index                                                                  
UHR1_1_1          14           8           6           2           2   
UHR1_1_2          25          14           8          12           2   

columns   read_gc_99      reads  
index                            
UHR1_1_1           1  103870017  
UHR1_1_2           2  103870017  

[2 rows x 4228 columns]

Making these tables more readable would require splitting the column names into key/value pairs and faceting sub-tables like:

read_gc    counts
91             40
92             23
93             20
94             14
...
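
One way the splitting and faceting could look, as a sketch only. The `facet` helper is hypothetical and assumes column names of the form `<prefix>_<key>` (as in `read_gc_91`); it does not handle names like `100_q05`, where the leading part is the varying key. The values are the `read_gc` counts from the pivoted output above.

```python
import pandas as pd

# A small slice of the wide (pivoted) table, using values from the output above.
wide = pd.DataFrame(
    {
        "read_gc_91": [40, 52],
        "read_gc_92": [23, 44],
        "read_gc_93": [20, 29],
        "reads": [103870017, 103870017],
    },
    index=pd.Index(["UHR1_1_1", "UHR1_1_2"], name="index"),
)


def facet(table, prefix):
    """Hypothetical helper: return a sub-table for columns named '<prefix>_<key>'.

    The result has one row per key and one column per sample.
    """
    cols = [c for c in table.columns if c.startswith(prefix + "_")]
    # Strip the prefix so only the key part (e.g. '91') remains.
    sub = table[cols].rename(columns=lambda c: c[len(prefix) + 1:])
    # Label the key axis with the prefix, then put keys on the rows.
    return sub.rename_axis(columns=prefix).T


read_gc = facet(wide, "read_gc")
print(read_gc)
```

Each prefix (`read_gc`, and so on) would get its own sub-table this way; columns without a key part, such as `reads`, would need to be reported separately.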

I don't actually remember the context for this request.