Textualize / rich-cli

Rich-cli is a command line toolbox for fancy output in the terminal

Home Page: https://www.textualize.io

Inconsistent header treatment for csv tables

AndydeCleyre opened this issue

Hello!

Sorry, I'm not sure exactly what's going on here, so I'll get right to it. Using Zsh:

$ rows=( Package,Version,Latest,Project 'tomli,2.0.0,2.0.1,~/Code/zpy' 'click,8.0.1,8.0.3,~/Code/archbuilder_iosevka' 'pep517,0.11.0,0.12.0,~/Code/archbuilder_iosevka' 'ruamel.yaml,0.17.17,0.17.21,~/Code/archbuilder_iosevka' 'tomli,1.2.1,2.0.1,~/Code/archbuilder_iosevka' )
$ rich --csv - <<<${(F)rows}

image

$ rows=( 'Package,Version,Latest,Project' 'tomli,2.0.0,2.0.1,~/Code/zpy' 'click,8.0.1,8.0.3,~/Code/archbuilder_iosevka' 'pep517,0.11.0,0.12.0,~/Code/archbuilder_iosevka' 'ruamel.yaml,0.17.17,0.17.21,~/Code/archbuilder_iosevka' 'tomli,1.2.1,2.0.1,~/Code/archbuilder_iosevka' )
$ rich --csv - <<<${(F)rows}

Same result as above

$ rows=( 'tomli,2.0.0,2.0.1,~/Code/zpy' 'click,8.0.1,8.0.3,~/Code/archbuilder_iosevka' 'pep517,0.11.0,0.12.0,~/Code/archbuilder_iosevka' 'ruamel.yaml,0.17.17,0.17.21,~/Code/archbuilder_iosevka' 'tomli,1.2.1,2.0.1,~/Code/archbuilder_iosevka' )
$ rich --csv - <<<${(F)rows}

image

What determines whether the first row gets treated as a header?

Thanks for any help!

It’s a heuristic used by the Python csv library, which is imperfect, as you have noticed. In the future I’ll expose a way to adjust this via an option.

Thanks! Do you know what about the input in this case gives CSV the wrong idea, so that I can work around this?

Not sure. You could have a look at the source of the csv module.
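One quick way to read that source without leaving the terminal, as a minimal sketch: the csv module's Sniffer class is written in pure Python, so the standard inspect module can usually print the method body. This assumes a regular CPython install where the .py files of the standard library are available.

```python
import csv
import inspect

# csv.Sniffer is implemented in Python (Lib/csv.py), so its source text
# can normally be retrieved at runtime.
source = inspect.getsource(csv.Sniffer.has_header)
print(source)
```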

FYI:

  • csv.Sniffer.has_header:
      def has_header(self, sample):
          # Creates a dictionary of types of data in each column. If any
          # column is of a single type (say, integers), *except* for the first
          # row, then the first row is presumed to be labels. If the type
          # can't be determined, it is assumed to be a string in which case
          # the length of the string is the determining factor: if all of the
          # rows except for the first are the same length, it's a header.
          # Finally, a 'vote' is taken at the end for each column, adding or
          # subtracting from the likelihood of the first row being a header.
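The behaviour from the original report can be reproduced directly against the standard library. The sketch below is an illustration, not rich-cli's own code: it assumes rich-cli hands the input to csv.Sniffer.has_header more or less unchanged, and guess_header is a hypothetical helper name. The rows are copied from the issue.

```python
import csv

# Rows copied from the issue; only the presence of the header line differs.
HEADER = "Package,Version,Latest,Project"
DATA_ROWS = [
    "tomli,2.0.0,2.0.1,~/Code/zpy",
    "click,8.0.1,8.0.3,~/Code/archbuilder_iosevka",
    "pep517,0.11.0,0.12.0,~/Code/archbuilder_iosevka",
    "ruamel.yaml,0.17.17,0.17.21,~/Code/archbuilder_iosevka",
    "tomli,1.2.1,2.0.1,~/Code/archbuilder_iosevka",
]


def guess_header(lines):
    """Return csv.Sniffer's verdict on whether the first line is a header.

    Hypothetical helper for this illustration only.
    """
    sample = "\n".join(lines) + "\n"
    return csv.Sniffer().has_header(sample)


with_header = guess_header([HEADER, *DATA_ROWS])
without_header = guess_header(DATA_ROWS)

print("header line present ->", with_header)
print("header line absent  ->", without_header)
```

Tracing the heuristic by hand, this should print False for the input that really has a header and True for the one that does not. None of the values parse as numbers, so every column falls back to string length. With the header line included, each column has rows of differing lengths and is dropped from consideration, so no votes are cast. Without it, the last four rows share the same Project path, so that column survives, and the shorter `~/Code/zpy` in the first row then counts as a vote for a header.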