brownplt / B2T2

The Brown Benchmark for Table Types (B2T2)

Investigate two more systems: data.table (R) and SAS

bennn opened this issue

From @ramanshah :

data.table in R is an exemplary system for analysis of tabular data. While cryptic to a beginner, it is far more concise and expressive because it commits to a real semantics rather than an ever-growing passel of functions to do everyday data analysis. It's also the best in performance, with a multithreaded query optimizer (as I understand) making splendid use of one's hardware on analytic workloads, with this parallelism beautifully abstracted away from the user. I think it's amazing, but when I'm on a collaborative R project and try to use it, I mostly get outvoted by Tidyverse users. Sigh. The data.table library's concept of a table feels like one of the closest to your own definition, so I wonder how B2T2 could shed light on my subjective enjoyment of it.

https://rdatatable.gitlab.io/data.table/

SAS is another quite popular data language in the wild. It has a startlingly different syntax and semantics and might be interesting vis-a-vis B2T2. I had a boss in finance (whose taste I respect) who loved SAS. It's hard to play with because it's proprietary (and extremely expensive if you're a private sector user), but maybe it would be stimulating.

https://en.wikipedia.org/wiki/SAS_language

We may want to extend B2T2 to illustrate problems that these languages solve.

  • Apache Arrow