WalternativE / datasaurus-exploration

Uni exercise visualizing the datasaurus dataset

Datasaurus Exploration

This repository was created as a short exercise during a seminar in my university's data science program. The conda environment needed to run the notebook is specified in environment.yml.

Takeaways

  • Datasaurus is a collection of datasets whose summary statistics (mean, standard deviation, and correlation of x and y) are nearly identical, while the underlying data look very different
  • These differences become obvious as soon as the datasets are plotted as scatter plots
  • HoloViews can be used to declaratively build different plots of the data, which helps avoid the traps you fall into when looking only at summary statistics
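
The first takeaway can be illustrated without the actual Datasaurus files. The following is a minimal sketch using only the Python standard library: it builds two synthetic point clouds (a circle and two horizontal lines, both made up for illustration and not part of the Datasaurus collection) and rescales them to the same mean and standard deviation. The target values `TARGET_MEAN` and `TARGET_STD` and the helper `rescale` are assumptions chosen for this example, not values or code from the notebook.

```python
import math
import statistics


def rescale(values, target_mean, target_std):
    """Shift and scale values to the given mean and population std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [target_mean + (v - mean) * target_std / std for v in values]


n = 200

# Synthetic shape 1: points evenly spaced on a circle.
circle_x = [math.cos(2 * math.pi * i / n) for i in range(n)]
circle_y = [math.sin(2 * math.pi * i / n) for i in range(n)]

# Synthetic shape 2: points alternating between two horizontal lines.
lines_x = [i / (n - 1) for i in range(n)]
lines_y = [0.0 if i % 2 == 0 else 1.0 for i in range(n)]

datasets = {
    "circle": (circle_x, circle_y),
    "lines": (lines_x, lines_y),
}

# Hypothetical target statistics, chosen arbitrarily for this sketch.
TARGET_MEAN, TARGET_STD = 50.0, 20.0

summary = {}
for name, (xs, ys) in datasets.items():
    xs = rescale(xs, TARGET_MEAN, TARGET_STD)
    ys = rescale(ys, TARGET_MEAN, TARGET_STD)
    # Summary statistics rounded to two decimals: mean and std of x and y.
    summary[name] = tuple(
        round(s, 2)
        for s in (
            statistics.fmean(xs),
            statistics.pstdev(xs),
            statistics.fmean(ys),
            statistics.pstdev(ys),
        )
    )
    print(name, summary[name])
```

Both shapes end up with the same rounded summary, even though one is a ring and the other is a pair of parallel lines. Only a plot of the points reveals the difference, which is the reason the notebook relies on scatter plots rather than on the statistics alone.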

Languages

Jupyter Notebook 100.0%