WIP - This is not suitable for production use.
Improved visualizations in Spark.
Supports PySpark >= 2.2.0.
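To check the version requirement above before installing, a small script along these lines may help. It is a minimal sketch: `parse_version`, `is_supported`, and `MIN_PYSPARK` are hypothetical helper names for illustration, not part of this package, and it assumes PySpark version strings start with dotted integers (for example `2.4.0` or `2.4.0.dev0`).

```python
import re

# Minimum PySpark version stated in this README.
MIN_PYSPARK = (2, 2, 0)


def parse_version(version):
    """Return the leading numeric components of a version string as a 3-tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    # A missing patch component (e.g. "3.5") is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_supported(version, minimum=MIN_PYSPARK):
    """Return True when `version` is at least `minimum`."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import pyspark
    except ImportError:
        print("PySpark is not installed")
    else:
        print(pyspark.__version__, "supported:", is_supported(pyspark.__version__))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating `2.10.0` as older than `2.2.0`.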
- Clone this repo.
- Use pip to install:
    pip install -e .
We follow the contributing guidelines of the nteract project.
pandas documentation
- pandas DataFrame
PySpark API documentation
- pyspark DataFrame
- pyspark.RDD: a Resilient Distributed Dataset (RDD), the basic abstraction in Spark
- pyspark.StorageLevel: controls storage of an RDD