harelba / q

q - Run SQL directly on delimited files and multi-file sqlite databases

Home Page:http://harelba.github.io/q/

format option does not work

greymd opened this issue

Hi,
The format option does not work correctly in the latest version, 2.0.19.

As documented here,
https://github.com/harelba/q/blob/master/examples/EXAMPLES.markdown

Now we'll see how we can format the output itself, so it looks better:
q -f "2=%4.2f" "SELECT c6,SUM(c5)/1024.0 AS size FROM exampledatafile GROUP BY c6 ORDER BY size DESC LIMIT 5"

But the actual behavior is as shown below: the format string is printed literally instead of the formatted value.

root@0cfd72164b2f:~# uname -a
Linux 0cfd72164b2f 5.4.0-1049-aws #51~18.04.1-Ubuntu SMP Fri May 14 18:38:46 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

root@0cfd72164b2f:/work# q --version
q version 2.0.19
Python: 3.7.1 (default, Jun 12 2019, 01:22:06) // [GCC 5.4.0 20160609]
Copyright (C) 2012-2020 Harel Ben-Attia (harelba@gmail.com, @harelba on twitter)
http://harelba.github.io/q/

root@0cfd72164b2f:~/q/examples# q -f "2=%4.2f" "SELECT c6,SUM(c5)/1024.0 AS size FROM exampledatafile GROUP BY c6 ORDER BY size DESC LIMIT 5"
2011-10-12 %4.2f
2012-01-28 %4.2f
2011-12-18 %4.2f
2011-10-04 %4.2f
2011-12-19 %4.2f

thanks for letting me know @greymd ! i'll take a look at it.

hi, sorry for the late reply, too many things in life in parallel :)

This is indeed a bug that started when moving to Python 3, since this feature doesn't abstract the Python implementation details properly.

I'll fix it soon after a new release is out.

In the meantime, if you need to use this feature, you can change the formatting to the Python 3 standard:

$ bin/q.py -f "2={:4.2f}" "SELECT c6,SUM(c5)/1024.0 AS size FROM examples/exampledatafile GROUP BY c6 ORDER BY size DESC LIMIT 5"
2011-10-12 372.35
2012-01-28 209.97
2011-12-18 70.84
2011-10-04 61.56
2011-12-19 24.00
$ q -v
q version 3.1.6
Python: 3.9.2 (default, Feb 28 2021, 17:03:44) // [GCC 10.2.1 20210110]
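
A minimal Python sketch of why the old pattern is echoed back literally. It assumes q hands each value to the format string via `str.format`, which is an assumption for illustration and not confirmed in this thread; the sample value is made up.

```python
# Hypothetical sample value, standing in for one cell of the "size" column.
value = 372.3456

# printf-style pattern applied with the % operator (the pre-Python-3 style).
old_style = "%4.2f" % value

# The same pattern passed to str.format: it contains no {} replacement
# field, so the string comes back unchanged and the value is ignored.
echoed = "%4.2f".format(value)

# The workaround pattern: a {} replacement field with a format spec.
new_style = "{:4.2f}".format(value)

print(old_style)  # 372.35
print(echoed)     # %4.2f
print(new_style)  # 372.35
```

The middle line matches the symptom in the original report, where every row showed `%4.2f` in the second column.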

Thanks for this workaround.

(a) But I do not know how to format two columns. According to the documented syntax of the -f option, I can format two columns by separating the formats with a comma. This doesn't work with the workaround syntax, for example when done like this:

-f '2={:4.2f},3={:7.2f}'

Only the first column gets formatted. Should I be doing this another way? Is there another way?

(b) And where is the Python format syntax documented? I've looked. I know about the printf % style formatting, but what does the ":" mean? A URL would be useful.

The format of the parameter of -f is as follows:

-f <C>=<F>,<C>=<F>,...

Where C is the output column number, and F is a python 3 format.

Examples:

  • -f 1={:4.3f} would cause the first output column to have a format of 4.3f, which means to round to 3 digits after the decimal point, with a minimum total width of 4 characters.
  • -f 1={:>20},2={:4.3f} would cause the first output column to have a format of >20 (right-aligned in a 20-character-wide field, so the padding goes on the left), and the second output column to have a format of 4.3f.
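
The `<C>=<F>` syntax above can be sketched in plain Python. This is only an illustration, not q's actual code: `parse_column_formats` and `format_row` are hypothetical helper names, and the naive split on commas would break for formats that themselves contain a comma.

```python
def parse_column_formats(spec):
    """Parse '<C>=<F>,<C>=<F>,...' into {column_number: format_string}."""
    formats = {}
    for part in spec.split(","):
        # Split on the first '=' only, so an '=' inside the format survives.
        column, fmt = part.split("=", 1)
        formats[int(column)] = fmt
    return formats


def format_row(row, formats):
    """Apply per-column formats; columns are numbered from 1."""
    return [
        formats[i].format(value) if i in formats else str(value)
        for i, value in enumerate(row, start=1)
    ]


formats = parse_column_formats("1={:>20},2={:7.5f}")
for n in range(1, 4):
    # Tab-separated output with the same value in both columns.
    print("\t".join(format_row((n, n), formats)))
```

Each printed row has the number right-aligned in a 20-character field, a tab, and then the same number with five decimal places.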

Here's an example query (input is generated using linux's seq command, which just generates a list of numbers):

$ seq 1 10 | q -f '1={:>20},2={:7.5f}' -t "select c1,c1 from -"
                   1	1.00000
                   2	2.00000
                   3	3.00000
                   4	4.00000
                   5	5.00000
                   6	6.00000
                   7	7.00000
                   8	8.00000
                   9	9.00000
                  10	10.00000
$

The official python3 formatting docs are in here
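
On the question of what the ":" means: in a Python replacement field, the text after the colon is a format spec, the same string the built-in `format()` function accepts as its second argument. A small sketch with made-up sample values:

```python
# Text after ':' inside {} is the format spec; format() takes it directly.
via_field = "{:7.2f}".format(3.14159)
via_builtin = format(3.14159, "7.2f")

# '<' left-aligns and '>' right-aligns within the given width.
left = "{:<6}".format("ab")
right = "{:>6}".format("ab")

print(repr(via_field))    # '   3.14'
print(repr(via_builtin))  # '   3.14'
print(repr(left))         # 'ab    '
print(repr(right))        # '    ab'
```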

Btw, can you provide examples of use-cases where you'd want formatting? This feature is very rarely used by users, so I haven't invested a lot in it relative to other parts of the product.