David Roher's repositories
etymology-db
An open etymology dataset created using Wiktionary data. Contains 3.8M entries, 1.8M terms, 2900 languages, and 31 unique relationship types.
diachronic
Get daily historical snapshots of every article on any wiki, formatted as Parquet files.
aoc-2018-sql
Advent of Code 2018 solutions in SQL
arrow
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.
arrow-datafusion
Apache Arrow DataFusion and Ballista query engines
arrow-rs
Official Rust implementation of Apache Arrow
baseball.computer.rs
Rust parser for the baseball.computer database.
boxball-snippets
Queries run on the Boxball DB.
dbt-duckdb
dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
duckdb-web
DuckDB-Web - Source code of duckdb.org
retrosheet
Enhanced version of Retrosheet (http://www.retrosheet.org) data.
metricflow
MetricFlow allows you to define, build, and maintain metrics in code.