Baby_names_US_IPython

A set of IPython notebooks to help analyze the US Social Security Administration baby names database

By David Taylor, www.prooffreader.com
Uses data from United States Social Security Administration; with some minor adjustments, it should be easily convertible to use other data sets.
Uses IPython notebooks to facilitate interactive exploration
Published on Github in case it can be useful to anyone who wishes to explore this or a similar dataset.
Feedback, positive or negative, is welcome.

Explanation of notebooks with links to nbviewer versions:

download_and_process.ipynb Downloads and makes pandas dataframe from US SSA baby names data
charting.ipynb Several cells to make several types of charts from the data
last_letter_boys_names.ipynb Script to make histograms of last letters in boys' names to show the incredible rise of '-n'. Blog post about it became wildly popular.
letter_freqs_and_bigrams.ipynb In progress, to find frequency of letters and letter pairs in baby names
singletons.ipynb In progress, to find names that only appear in the database once.
trendiness.ipynb To determine trendiest names (with an popular, not mathematic definition of trendy) using an analytical chemistry data analysis technique. The blog post about it was insanely popular, which I did not at all predict!

note: download_and_process.py is simply the .py export of download_and_process.ipynb, so it can be called by other notebooks to load the data.

About

A set of IPython notebooks to help analyze the US Social Security Administration baby names database

Language:Python 100.0%