kylebaron / data-2024

Apache Arrow Based Workflows for Large Data Analytics

Slides are available in PDF format

arrow.qmd runs simulations with mrgsolve and processes outputs with Apache Arrow
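As a rough illustration of the second half of that workflow, the sketch below writes simulation-like output to a partitioned Parquet dataset and summarizes it lazily with Arrow. It is not taken from arrow.qmd: the `sims` data frame is a made-up stand-in for mrgsolve output, and the column names (`rep`, `time`, `conc`) and the `out_dir` location are assumptions chosen for the example.

```r
library(arrow)
library(dplyr)

# Hypothetical stand-in for simulated output; in the repository this
# would come from an mrgsolve simulation, which is not reproduced here.
set.seed(1)
sims <- data.frame(
  rep  = rep(1:4, each = 50),
  time = rep(seq(0, 49), times = 4),
  conc = runif(200)
)

# Assumed output location for this sketch
out_dir <- file.path(tempdir(), "sims")

# Write one Parquet directory per replicate (hive-style partitioning on `rep`)
write_dataset(sims, out_dir, format = "parquet", partitioning = "rep")

# Open the dataset lazily; nothing is read into memory until collect()
ds <- open_dataset(out_dir)

# Summarize on the Arrow side, then pull the small result into R
summ <- ds %>%
  group_by(rep) %>%
  summarise(cmax = max(conc)) %>%
  collect() %>%
  arrange(rep)

print(summ)
```

Partitioning by replicate is only one possible layout; it keeps each replicate in its own file so that later queries filtering on `rep` can skip the other files.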

See the Makefile for tasks

  • make taxi calls script/taxi.R to download and save tiny NYC taxi data

  • make read-taxi runs read benchmarks for NYC taxi data via script/read-taxi.R

  • make write-taxi runs write benchmarks for NYC taxi data via script/write-taxi.R

  • make size compares file sizes via script/size.R
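The kind of comparison these tasks run can be sketched as follows. This is not the content of script/read-taxi.R, script/write-taxi.R, or script/size.R; it uses a small synthetic data frame in place of the NYC taxi data, and the column names and temporary file paths are assumptions for illustration.

```r
library(arrow)

# Synthetic stand-in for the taxi data (hypothetical columns)
set.seed(1)
n <- 1e5
df <- data.frame(
  id     = seq_len(n),
  fare   = round(runif(n, 3, 80), 2),
  vendor = sample(c("A", "B"), n, replace = TRUE)
)

csv_file <- tempfile(fileext = ".csv")
pq_file  <- tempfile(fileext = ".parquet")

# Write benchmarks: elapsed seconds for each format
t_write_csv <- system.time(write.csv(df, csv_file, row.names = FALSE))[["elapsed"]]
t_write_pq  <- system.time(write_parquet(df, pq_file))[["elapsed"]]

# Read benchmarks: elapsed seconds for each format
t_read_csv <- system.time(x_csv <- read.csv(csv_file))[["elapsed"]]
t_read_pq  <- system.time(x_pq <- read_parquet(pq_file))[["elapsed"]]

# File sizes in bytes
sizes <- c(csv = file.size(csv_file), parquet = file.size(pq_file))

print(c(write_csv = t_write_csv, write_parquet = t_write_pq))
print(c(read_csv = t_read_csv, read_parquet = t_read_pq))
print(sizes)
```

A single `system.time()` call is a coarse measurement; repeating each step several times, or using a dedicated benchmarking package, would give more stable numbers.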

Languages

  • R 99.0%

  • Makefile 1.0%