CLI tool for analyzing CSV data
To install the stats_cli package as an executable use:
go install github.com/Meowcenary/stats_cli@latest
After installing commands can be run with:
stats_cli <command> [options]
To clone the stats_cli package use:
git clone https://github.com/Meowcenarystats_benchmark.git
Once cloned commands can be run from the root directory:
go run main.go <command> [options]
To run all the tests use go test ./...
from the root directory of the project.
To run a single package's tests use go test ./<package name>
E.g
go test ./analysis
To run the benchmark use go test -bench=.
. Pass the flag -count
to run the
benchmark repeatedly. For counts over 700 the flag -timeout
will need to be
passed or the benchmark will quit before finishing all runs. Here is an example
of a full benchmarking command for 1000 runs:
go test -bench=. -count=1000 -timeout 20m
Currently the only command available is summary
which reads a csv file of
decimal and integer numbers from the path provided by the argument --file
.
File is the only requird argument, the rest are optional.
All available arguments for stats_cli summary
:
--file
- CSV file to read data from--columns
- Columns to include from CSV for summary output--stats
- Stats to include for summary output
Example usage:
stats_cli summary --file testdata.csv --columns age,income --stats max,min,mean
Or from the project root with
go run main.go summary --file testdata.csv --columns age,income --stats max,min,mean
Details for running the Golang benchmark can be found above. The average benchmark time over 100 runs was 0.09062145 ns.
The Python script was benchmarked using the library timeit
. The average
benchmark time over 100 runs was 10.194 ns.
The R script was benchmarked using a simple system time comparison. The average benchmark time over 100 runs was 73757629 ns (0.073757629 s).