Datasaurus Exploration

This repository was created as a short exercise during a seminar in my university's data science program. The conda environment used to run the notebook is specified in environment.yml.

Takeaways

Datasaurus is a collection of datasets whose means and standard deviations are nearly identical, although the underlying data points look very different.
The differences between the datasets become obvious as soon as they are plotted as scatter plots.
HoloViews can be used to build different plots declaratively, which makes it easier to analyze the data without falling into the traps you encounter when looking only at summary statistics.
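
The first takeaway can be illustrated with a minimal sketch. It does not use the Datasaurus data or the notebook's code: the two point sets below are hypothetical toy data, and the names `toy_datasets` and `summarize` are made up for this example. The sketch uses only the Python standard library and shows how two differently shaped point sets can share the same means and standard deviations.

```python
from statistics import mean, stdev

# Hypothetical toy data, NOT the Datasaurus values: two tiny point sets
# constructed so that x and y have the same mean and standard deviation.
toy_datasets = {
    "rising": [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)],
    "falling": [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)],
}


def summarize(points):
    """Return the mean and sample standard deviation of x and y."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return {
        "x_mean": mean(xs),
        "y_mean": mean(ys),
        "x_std": round(stdev(xs), 3),
        "y_std": round(stdev(ys), 3),
    }


# One summary per dataset; both summaries come out identical.
summaries = {name: summarize(points) for name, points in toy_datasets.items()}

for name, stats in summaries.items():
    print(name, stats)
```

Both summaries match even though one point set rises and the other falls, which only becomes visible once the points themselves are plotted.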
