manojkarthick / pqrs

Command line tool for inspecting Parquet files

Directory support in `head` / pipe support

Hoeze opened this issue

Hi, we're very happily using pqrs now and found two small issues with it:

  1. head does not support directories:
#> pqhead data.parquet 
Error: ParquetError(General("underlying IO error: Is a directory (os error 21)"))
  2. It panics when used in a pipe:
#> pqcat data.parquet | head

###########################################################################################################################################################################################################
File: data.parquet/d66ac6554cc44c3cbfaa56b75fa446e4.parquet
###########################################################################################################################################################################################################

[...]
thread 'main' panicked at 'failed printing to stdout: Broken pipe (os error 32)', library/std/src/io/stdio.rs:935:9

Hi! Thanks for filing the issue.

  1. That is the expected behaviour for `pqrs head`, which works like the `head` command on *nix and does not traverse directories. I am not sure head-ing directories should be supported: picking a file at random makes the output non-deterministic, while sorting the files by some property (file name, last-modified time, etc.) will take a long time if there are many files.

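  2. The second issue is caused by an upstream bug in how Rust handles SIGPIPE: the signal is ignored by default, so writing to a closed pipe returns a broken-pipe error and `println!` panics on it. Ref: rust-lang/rust#46016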
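
To make the trade-off in point 1 concrete, here is a minimal sketch of a deterministic choice: pick the lexicographically smallest `.parquet` file in a directory. It is not how pqrs works. The function name `first_parquet_file` and the filter on the `.parquet` extension are assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of pqrs): returns the lexicographically
/// smallest `.parquet` file directly inside `dir`, or `None` if there is none.
fn first_parquet_file(dir: &Path) -> io::Result<Option<PathBuf>> {
    let mut first: Option<PathBuf> = None;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Assumption: data files are regular files ending in ".parquet".
        let is_parquet =
            path.is_file() && path.extension().map_or(false, |ext| ext == "parquet");
        // Keep only the smallest path seen so far instead of sorting everything.
        if is_parquet && first.as_ref().map_or(true, |best| path < *best) {
            first = Some(path);
        }
    }
    Ok(first)
}
```

Keeping a running minimum avoids a full sort, but every directory entry still has to be listed once, so the cost still grows with the number of files.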
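
For point 2, a common workaround in Rust command-line tools is to write with `writeln!` rather than `println!` and to treat a broken pipe as a normal end of output. The sketch below uses only the standard library. It is not pqrs's actual code, and `write_lines` is a hypothetical helper name.

```rust
use std::io::{self, ErrorKind, Write};

/// Hypothetical helper (not part of pqrs): writes each line to `out` and
/// stops quietly once the reader has closed the pipe.
/// Returns the number of lines that were fully written.
fn write_lines<W: Write>(out: &mut W, lines: &[&str]) -> io::Result<usize> {
    let mut written = 0;
    for line in lines {
        // Unlike `println!`, `writeln!` returns the error instead of panicking.
        match writeln!(out, "{}", line) {
            Ok(()) => written += 1,
            // The consumer (e.g. `head`) exited: stop without an error.
            Err(e) if e.kind() == ErrorKind::BrokenPipe => return Ok(written),
            // Any other I/O failure is still reported to the caller.
            Err(e) => return Err(e),
        }
    }
    Ok(written)
}
```

A caller could pass `&mut io::stdout().lock()` as `out`, so that a pipeline like `pqcat data.parquet | head` ends cleanly once `head` has read enough.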