Repository to accompany "Pandas for Everyone"
The easiest way to get everything you need for the tutorial is to install Anaconda.
You can download and install it here: https://www.continuum.io/downloads
To download just the data, see the Data section below. Otherwise, you can clone this repository, or click the "Clone or Download" link above and then click "Download ZIP".
conda install seaborn
The preface of the book contains an error in the instructions for installing packages. I am leaving this section in the README to provide an updated list of packages and installation instructions.
You can choose to create a virtual environment for the packages used in the book, so they don't clash with other packages you plan to use later on.
# create a virtual environment named "book" using python 3.6
conda create -n book python=3.6
# activate the environment
# so all installed packages will go in there and not mess up your base python environment
source activate book
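To confirm that the environment is active before installing anything, a short Python check can help. This is a minimal sketch, not part of the book: the function name `describe_environment` is hypothetical, and it assumes conda has set the `CONDA_DEFAULT_ENV` variable on activation, which will not be the case for other virtual environment tools.

```python
import os
import sys


def describe_environment(expected_name="book"):
    """Report which Python prefix and conda environment are in use."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated;
    # the variable is absent (None here) outside of conda.
    active = os.environ.get("CONDA_DEFAULT_ENV")
    return {
        "prefix": sys.prefix,            # directory of the running interpreter
        "active_env": active,            # name of the active conda environment
        "matches": active == expected_name,
    }


if __name__ == "__main__":
    info = describe_environment()
    print("Python prefix:", info["prefix"])
    print("Active conda environment:", info["active_env"])
    if not info["matches"]:
        print('The "book" environment does not appear to be active.')
```

If the last message appears, run the activation command again in the same terminal before installing packages.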
Whether you decided to create a virtual environment or not, you can install the packages with the commands below.
If you did create a virtual environment, remember to run `source activate book` before you follow along with the book, so the packages you installed can be loaded.
conda install pandas xlwt openpyxl seaborn numpy ipython jupyter statsmodels scikit-learn regex wget odo numba
conda install -c conda-forge pweave # you don't really need this package, it was used to build and create the book
conda install -c conda-forge feather-format
pip install lifelines pandas-datareader
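After the install commands finish, you may want to check that the packages can be found by Python. The sketch below is an illustration rather than part of the book: `IMPORT_NAMES` and `find_missing` are hypothetical names, and the mapping covers only a subset of the packages above. Note that some install names differ from their import names (for example, `scikit-learn` is imported as `sklearn`).

```python
from importlib.util import find_spec

# Install name -> import name for a subset of the packages listed above.
# This mapping is an assumption for illustration; extend it as needed.
IMPORT_NAMES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "seaborn": "seaborn",
    "statsmodels": "statsmodels",
    "scikit-learn": "sklearn",
    "openpyxl": "openpyxl",
    "xlwt": "xlwt",
    "numba": "numba",
    "lifelines": "lifelines",
    "pandas-datareader": "pandas_datareader",
}


def find_missing(import_names=IMPORT_NAMES):
    """Return the install names whose module cannot be located."""
    # find_spec returns None when a top-level module is not installed,
    # without actually importing the module.
    return sorted(
        package
        for package, module in import_names.items()
        if find_spec(module) is None
    )


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All checked packages were found.")
```

Any package reported as not found can be installed again with the matching `conda install` or `pip install` command.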
To download only the datasets, you can use Minhas Kamal's DownGit by clicking the link here
Ongoing list of data references:
- Gapminder: https://github.com/jennybc/gapminder/raw/master/inst/gapminder.tsv
- Survey: Comes from the Software-Carpentry SQL lesson
- Ebola: www.github.com/cmrivers/ebola