fisher-lebo / nomcom

combinations of names

nomcom

NOTICE: This repo has moved to github.com/aaron-lebo/nomcom.

combinations of names of notable individuals, such as:

Please see combos.txt for full results.

On Reddit there is a user with the username AlGoreVidalSassoon. This is the combination of Al Gore + Gore Vidal + Vidal Sassoon, which struck me as terribly clever, so I wanted some way to automate combining names like this.

After some Googling, I struggled to find anything relevant, which surprised me somewhat as I think this is a fun puzzle. I had some idea of the approach I needed to do it myself, but then struggled to find a database of historical names anywhere. Wikipedia to the rescue. One of Wikipedia's many lists of lists includes notable individuals grouped by nationality. The first part of the script scrapes a number of these individual pages. The current database generated from this includes people from Britain as well as Americans from each individual state. These were chosen both to limit the amount of data that needed to be pulled and to keep things relatively simple with Anglicized names.

The next part attempts to match the names. Each name is converted to first name + last name, so Ralph Waldo Emerson becomes Ralph Emerson. This does not keep perfectly with the historical record, but it makes sure that in each combination of three names, each adjacent pair of words is the name of an individual. For example, if Woodrow T Wilson were used, one of the combinations could be Woodrow T Wilson Chandler Bing. Woodrow T and T Wilson aren't the names of the individuals we are aiming for. Breaking each name down to first name + last name keeps a nice symmetry.
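As a rough illustration of the first name + last name step described above, here is a minimal sketch. The function name `simplify_name` is hypothetical and is not taken from the script. Dropping single-word names is an assumption made here for simplicity.

```python
def simplify_name(full_name):
    """Reduce a full name to a (first, last) pair.

    Hypothetical helper for illustration only. Middle names and initials
    are dropped; single-word names are skipped (an assumption, since they
    cannot be split into first + last).
    """
    parts = full_name.split()
    if len(parts) < 2:
        return None
    return parts[0], parts[-1]


if __name__ == "__main__":
    print(simplify_name("Ralph Waldo Emerson"))
    print(simplify_name("Woodrow T Wilson"))
```

With names reduced this way, every pair of neighboring words in a chain can be looked up as one person.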

The matching algorithm is naive but works. Currently it combines 3 names, but the chains could be as short or long as desired. The initial input includes 28,364 names and the final result is 37,166 combinations.
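One naive way to do this kind of matching is sketched below. It may not be how the script itself works. It indexes names by first name, then repeatedly follows each last name to any person whose first name is that word. The function name `chain_names` and its `length` parameter are hypothetical, and the input is assumed to be (first, last) pairs.

```python
from collections import defaultdict


def chain_names(names, length=3):
    """Return sorted chains that overlap `length` names end to start.

    `names` is an iterable of (first, last) tuples. A chain of three names
    has four words, e.g. Al Gore + Gore Vidal + Vidal Sassoon.
    Illustrative sketch only, not the repository's implementation.
    """
    # Map each first name to every last name it is paired with.
    by_first = defaultdict(list)
    for first, last in set(names):
        by_first[first].append(last)

    def extend(chain):
        # A chain of `length` names contains `length + 1` words.
        if len(chain) == length + 1:
            yield chain
            return
        # The last word so far must be someone else's first name.
        for next_word in by_first.get(chain[-1], []):
            yield from extend(chain + [next_word])

    results = []
    for first in list(by_first):
        for chain in extend([first]):
            results.append(" ".join(chain))
    return sorted(results)


if __name__ == "__main__":
    sample = [
        ("Al", "Gore"),
        ("Gore", "Vidal"),
        ("Vidal", "Sassoon"),
        ("Ralph", "Emerson"),
    ]
    for combo in chain_names(sample):
        print(combo)
```

Changing `length` gives shorter or longer chains, which matches the note above that the number of combined names is adjustable.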

Thanks to Matthew Martin of the Linux User's Group @ UT Dallas for algorithm help.

License:MIT License


Languages

Language:Python 100.0%