Matthew Powers's repositories

quinn

PySpark methods to enhance developer productivity 📣 👯 🎉

chispa

PySpark test helper methods with beautiful error messages

Language: Python | License: MIT | Stargazers: 508 | Issues: 7 | Issues: 48

spark-fast-tests

Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)

Language: Scala | License: MIT | Stargazers: 418 | Issues: 15 | Issues: 66

mack

Delta Lake helper methods in PySpark

Language: Python | License: MIT | Stargazers: 269 | Issues: 14 | Issues: 68

spark-style-guide

Spark style guide

Language: Jupyter Notebook | Stargazers: 249 | Issues: 18 | Issues: 6

jodie

Delta Lake and filesystem helper methods

Language: Scala | License: MIT | Stargazers: 44 | Issues: 11 | Issues: 46

farsante

Fake Pandas / PySpark DataFrame creator

ceja

PySpark phonetic and string matching algorithms

Language: Python | License: MIT | Stargazers: 32 | Issues: 3 | Issues: 6

levi

Delta Lake helper methods. No Spark dependency.

Language: Python | License: MIT | Stargazers: 15 | Issues: 4 | Issues: 17

python-parquet-examples

Using the Parquet file format with Python

Language: Python | Stargazers: 13 | Issues: 3 | Issues: 0

eren

PySpark Hive helper methods

deltadask

Delta Lake powered by Dask

Language: Jupyter Notebook | Stargazers: 5 | Issues: 2 | Issues: 0

mrpowers-benchmarks

MrPowers benchmarks for Dask, Polars, DataFusion, and pandas

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 5 | Issues: 3 | Issues: 4

mesita

Print colorful tables with nice diffs in the terminal

data-scrapbook

A collection of images and captions to explain core data concepts

delta-rs

A native Rust library for Delta Lake, with bindings into Python

Language: Rust | License: Apache-2.0 | Stargazers: 2 | Issues: 1 | Issues: 0

mrpowers-book

Book on MrPowers OSS projects, blogs, and other assets

mrpowers.github.io

Documentation and stuff

Language: HTML | Stargazers: 1 | Issues: 3 | Issues: 0

pydata-examples

Examples of various PyData technologies like pandas, DataFusion, DuckDB, and Polars

Language: Jupyter Notebook | Stargazers: 1 | Issues: 3 | Issues: 0

pyspark-examples

PySpark example notebooks

Language: Jupyter Notebook | Stargazers: 1 | Issues: 2 | Issues: 0

sparkprof

Pretty Spark docs

arrow-datafusion

Apache Arrow DataFusion SQL Query Engine

Language: Rust | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

arrow-datafusion-python

Apache Arrow DataFusion Python Bindings

Language: Rust | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

db-benchmark

Reproducible benchmark of database-like operations

Language: R | License: MPL-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

deltaray

Delta reader for the Ray open-source toolkit for building ML applications

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

lance-examples

Examples with Lance table format

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

spark

Mirror of Apache Spark

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 3 | Issues: 0

spark-website

Apache Spark Website

License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0