This repository was created as a short exercise during a seminar in my university's data science program.
The conda environment used to work in the notebook is documented in the file environment.yml.
- Datasaurus is a collection of datasets with nearly identical means and standard deviations, even though the underlying data look very different
- Plotting the datasets as scatter plots makes these differences obvious
- HoloViews can be used to build different plots declaratively, which helps analyze the data without falling into the traps you would encounter when looking only at summary statistics
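
The point about summary statistics can be illustrated with a minimal sketch. It does not use the actual Datasaurus data: `square` and `diamond` are hypothetical toy datasets made up for this example, and `summarize` is an illustrative helper, not something from the notebook. Only the Python standard library is assumed.

```python
import math
import statistics

# Hypothetical toy datasets (NOT the real Datasaurus data):
# four points on the corners of a square, and four points on a diamond.
r = math.sqrt(2)
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
diamond = [(r, 0.0), (-r, 0.0), (0.0, r), (0.0, -r)]


def summarize(points):
    """Return (mean_x, mean_y, sd_x, sd_y), rounded to two decimals."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (
        round(statistics.mean(xs), 2),
        round(statistics.mean(ys), 2),
        round(statistics.pstdev(xs), 2),  # population standard deviation
        round(statistics.pstdev(ys), 2),
    )


# The summaries match even though the points themselves differ.
print("square :", summarize(square))
print("diamond:", summarize(diamond))
print("same summary:", summarize(square) == summarize(diamond))
print("same points :", sorted(square) == sorted(diamond))
```

Only a plot reveals the difference between the two shapes. In the notebook, a list of `(x, y)` tuples like these could be passed to a HoloViews `hv.Scatter` element, assuming HoloViews is installed from the conda environment.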