Takeshi Yamamuro's repositories
spark-tpcds-datagen
Everything about TPC-DS in Apache Spark
spark-sql-flow-plugin
Visualize column-level data lineage in Spark SQL
spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
datasketches-spark
Data Sketches for Apache Spark
spark-data-repair-plugin
Builds statistical models to repair dirty tabular data in Spark
spark-query-log-plugin
A simple toolkit to analyze Spark query logs
fuzz-testing-for-spark
[WIP] Run SQL-aware fuzz tests for the Catalyst optimizer in Apache Spark
spark-graphx-pregel-personalized-pagerank
Personalized PageRank on Pregel/GraphX
mlflow-example
Example code for MLflow
spark-executor-dict-plugin
Fast Read-only Data Dictionary Attached to Each Spark Executor
jupyterlab-dockerfile
A Dockerfile for JupyterLab with PySpark
jvmci-test
A toy box for testing JVMCI in JDK 11
equipartitioning-example
Equipartitioning in Spark
lstm-crf-pytorch
LSTM-CRF in PyTorch
pg_stats_exporter
A PostgreSQL metrics exporter for Prometheus
pgvector
Open-source vector similarity search for Postgres
pydeps-neo4j
Exports Python package dependencies into Neo4j
rag-postgres
A trial place for RAG with PostgreSQL resources
spark-tpcds-sf-1
TPC-DS queries at the 1 GB scale factor
spark-website
Mirror of Apache Spark Website