ShrykeWindgrace / treestats

Command-line tool to produce statistics about a directory tree

treestats

treestats is a simple tool to compute the count and the total size of each file extension in a directory tree.
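As an illustration of what "count and total size per extension" means, here is a minimal Rust sketch of the aggregation step. It is not taken from the treestats source: the function names `extension_key` and `aggregate` are hypothetical, and details such as case sensitivity or the handling of dotfiles are assumptions based on the behavior of `std::path::Path::extension`. Files without a usable extension are filed under "other", as described in the Output section.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Key under which a file is counted (hypothetical helper): its extension
/// when it has one that is valid UTF-8, otherwise the "other" pseudo-extension.
fn extension_key(path: &Path) -> String {
    path.extension()
        .and_then(|ext| ext.to_str()) // None when the extension is not valid UTF-8
        .unwrap_or("other")
        .to_string()
}

/// Aggregate (path, size in bytes) pairs into extension -> (count, total size).
/// Assumption: the caller has already walked the tree and collected file sizes.
fn aggregate<'a, I>(files: I) -> HashMap<String, (u64, u64)>
where
    I: IntoIterator<Item = (&'a Path, u64)>,
{
    let mut stats: HashMap<String, (u64, u64)> = HashMap::new();
    for (path, size) in files {
        let entry = stats.entry(extension_key(path)).or_insert((0, 0));
        entry.0 += 1; // one more file with this extension
        entry.1 += size; // add its size to the running total
    }
    stats
}
```

In this sketch, only the last component after the final dot counts as the extension, so a file named `archive.tar.gz` would be counted under `gz`. Whether treestats itself behaves this way is not stated here.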

Usage

Usage: treestats analyze <directory>
       treestats -h | --help
       treestats --version

Commands:
    analyze                  Analyze a directory tree and output statistics.

Arguments:
    directory                Root directory of the directory tree to analyze.

Options:
    -h, --help               Show this screen.
    --version                Show version.

Output

The output is TSV with no header row and three tab-separated (\t) columns: extension, file count, and total size in bytes. Files that have no extension, or whose extension is not valid UTF-8, are grouped under the “other” pseudo-extension.

This makes it easy to compose treestats with other tools to filter, sort, or pretty-print the output.
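The same format is also easy to consume from a program rather than a shell pipeline. The following Rust sketch parses the three-column output into typed rows. The `Row` type and the `parse_row` and `parse_output` functions are hypothetical names introduced for this example. Lines that are not data rows, such as the progress summary shown in the examples below, are skipped. That is a defensive choice, since the stream the summary is written to is not stated here.

```rust
/// One row of treestats output: extension, file count, total size in bytes.
#[derive(Debug, PartialEq)]
struct Row {
    extension: String,
    count: u64,
    size: u64, // u64 because totals can exceed 4 GiB
}

/// Parse one output line. Returns None for anything that is not exactly
/// three tab-separated fields with a numeric count and size.
fn parse_row(line: &str) -> Option<Row> {
    let mut fields = line.split('\t');
    let extension = fields.next()?.to_string();
    let count = fields.next()?.parse().ok()?;
    let size = fields.next()?.parse().ok()?;
    if fields.next().is_some() {
        return None; // more than three columns: not a data row
    }
    Some(Row { extension, count, size })
}

/// Parse a whole output, silently skipping lines that are not data rows.
fn parse_output(output: &str) -> Vec<Row> {
    output.lines().filter_map(parse_row).collect()
}
```

A caller could, for instance, read the standard output of `treestats analyze <directory>` into a string and pass it to `parse_output`, then sort or filter the resulting rows in code.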

Examples

Top 3 extensions sorted by decreasing total size

$ treestats analyze ~/Music | sort -nrk3 | head -n 3
  5654 files, 25.19 GiB analyzed [00:00:02]
  ogg	1922	15536178965
  mp3	2999	10386533620
  wma	85	394384068

Top 3 extensions sorted by decreasing file count, pretty-printed in columns

$ treestats analyze ~/Music | sort -nrk2 | head -n 3 | column -t
  5654 files, 25.19 GiB analyzed [00:00:02]
  mp3  2999  10386533620
  ogg  1922  15536178965
  au   128   97610120

Same thing with csview:

$ treestats analyze ~/Music | sort -nrk2 | head -n 3 | csview -H -t -s Rounded
  5654 files, 25.19 GiB analyzed [00:00:02]
╭─────┬──────┬─────────────╮
│ mp3 │ 2999 │ 10386533620 │
│ ogg │ 1922 │ 15536178965 │
│ au  │ 128  │ 97610120    │
╰─────┴──────┴─────────────╯

Top 3 extensions with more than 10 files sorted by increasing file count

$ treestats analyze ~/Music | awk '$2 > 10' | sort -nk2 | head -n 3
  5654 files, 25.19 GiB analyzed [00:00:02]
  xm	11	12172128
  mix	12	31213
  rm	19	5305945

Manipulating data on Windows

We can use the cmdlets provided by PowerShell (powershell/pwsh), though we need to specify the delimiter and provide a header. In this example, we output the top 3 extensions sorted by decreasing total size. Pretty-printing is handled by PowerShell.

PS> treestats.exe analyze $HOME\Downloads\ | ConvertFrom-Csv -Delimiter `t -Header 'extension', 'count', 'size' | sort {[long]$_.size} -Descending -top 3
  4893 files, 985.43 MiB analyzed [00:00:00]
extension count size
--------- ----- ----
exe       21    257707977
zip       19    158416662
pdf       128   146198528

Compiling

Run cargo build --release in your working copy.

Contributing and reporting bugs

Contributions are welcome through GitHub pull requests.

Please report bugs and feature requests on GitHub issues.

License

treestats is copyright (C) 2022 Jérémie Roquet <jroquet@arkanosis.net> and licensed under the ISC license.
