batermj / github_globe

A visualization of GitHub users throughout the globe

Home Page: http://aasen.in/github_globe/

Github Globe

This is a small visualization of GitHub users around the world. Check out the interactive version at http://aasen.in/github_globe

Image: GitHub users plotted on a globe

Creating the Visualization

All data is provided by GitHub Archive and fetched via Google BigQuery.

Locations are provided by approximately 1 million of the 4 million GitHub users. They are written in an informal syntax with varying specificity. For example, "Seattle", "Seattle, WA", and "United States" are all valid.
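Because the location field is free text, a first pass can simply tally the raw strings and keep the most frequent ones. The sketch below is one way to do that, not necessarily how this repo does it; the function name `most_common_locations` and the sample values are made up for illustration.

```python
from collections import Counter


def most_common_locations(raw_locations, limit=1000):
    """Tally free-text locations and return the `limit` most frequent ones."""
    counts = Counter()
    for raw in raw_locations:
        # Skip users who left the field empty.
        if raw is None:
            continue
        cleaned = " ".join(raw.split())  # collapse stray whitespace
        if cleaned:
            counts[cleaned] += 1
    return counts.most_common(limit)


# Made-up sample; real input would be the location column from the fetched data.
sample = ["Seattle", "Seattle, WA", "seattle", "United States", "Seattle", None, "  "]
top = most_common_locations(sample, limit=2)
```

Spelling variants such as "seattle" are deliberately left separate here, since the geocoding step is what resolves them to the same place.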

The 1,000 most common locations are passed through the Google Geocoding API and transformed into geographical coordinates.
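As a rough sketch of this step, the snippet below builds a request URL for the Geocoding API's JSON endpoint and pulls the latitude and longitude out of the first result. The helper names, the `api_key` parameter, and the hand-written sample response are assumptions for illustration; the repo's own code in master/transform may work differently.

```python
import json
import urllib.parse
import urllib.request

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key):
    """Return the request URL for one free-text address."""
    query = urllib.parse.urlencode({"address": address, "key": api_key})
    return f"{GEOCODE_ENDPOINT}?{query}"


def extract_coordinates(response):
    """Return (lat, lng) from the first result, or None when nothing matched."""
    if response.get("status") != "OK" or not response.get("results"):
        return None
    location = response["results"][0]["geometry"]["location"]
    return (location["lat"], location["lng"])


def geocode(address, api_key):
    """Fetch and parse one address. This performs a network request."""
    with urllib.request.urlopen(build_geocode_url(address, api_key)) as reply:
        return extract_coordinates(json.load(reply))


# Hand-written response with illustrative values, used instead of a live call.
sample_response = {
    "status": "OK",
    "results": [{"geometry": {"location": {"lat": 47.6062, "lng": -122.3321}}}],
}
coords = extract_coordinates(sample_response)
```

Keeping the parsing separate from the network call makes it easy to check the extraction logic against saved responses without spending API quota.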

Data is then grouped by coordinates, so "San Francisco", "San Francisco, CA", and "San Fran" are combined.
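A minimal sketch of the grouping step, assuming the user counts and geocoded coordinates are held in two dictionaries keyed by the original location string. Rounding the coordinates before using them as keys is an assumption added here so that tiny floating-point differences do not split a group; the counts and coordinates in the example are made up.

```python
from collections import defaultdict


def group_by_coordinates(location_counts, coordinates, precision=2):
    """Sum user counts for every location that resolves to the same point."""
    totals = defaultdict(int)
    for name, count in location_counts.items():
        point = coordinates.get(name)
        if point is None:
            continue  # this location was never geocoded
        key = (round(point[0], precision), round(point[1], precision))
        totals[key] += count
    return dict(totals)


# Made-up counts and coordinates for illustration.
counts = {"San Francisco": 500, "San Francisco, CA": 300, "San Fran": 20, "Oakland": 80}
coords = {
    "San Francisco": (37.7749, -122.4194),
    "San Francisco, CA": (37.7749, -122.4194),
    "San Fran": (37.7749, -122.4194),
    "Oakland": (37.8044, -122.2712),
}
grouped = group_by_coordinates(counts, coords)
```

Here the three San Francisco spellings collapse into a single entry, while Oakland stays separate.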

Finally, data is plotted on the WebGL Globe.
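The WebGL Globe reads its data as named series, each holding a flat list of latitude, longitude, magnitude triples. The sketch below converts grouped counts into that shape; the series name `users`, scaling magnitudes against the largest group, and the sample counts are all assumptions for illustration rather than the repo's actual choices.

```python
import json


def to_globe_series(grouped, series_name="users"):
    """Flatten {(lat, lng): count} into [[name, [lat, lng, magnitude, ...]]]."""
    if not grouped:
        return [[series_name, []]]
    peak = max(grouped.values())
    flat = []
    for (lat, lng), count in sorted(grouped.items()):
        # Scale each count against the largest group so magnitudes fall in (0, 1].
        flat.extend([lat, lng, round(count / peak, 4)])
    return [[series_name, flat]]


# Made-up grouped counts for illustration.
grouped = {(37.77, -122.42): 820, (47.61, -122.33): 410}
payload = json.dumps(to_globe_series(grouped))
```

The resulting JSON string can be written to a file and loaded by the page that hosts the globe.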

Anatomy of the Repo

All queries are stored in master/fetch.

Code to transform the data is in master/transform.

The visualization and WebGL globe are stored in the gh-pages branch.

Problems

One problem is that locations vary in specificity. Many people report only their country and leave out states, cities, and other identifiers.

To solve this, I could calculate the specificity of each address and leave out broad locations such as China, America, and California. However, that would skew the data, so I decided not to. If you disagree, fork it!
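For anyone who does fork it, one possible approach is to use the `types` list that the Geocoding API attaches to each result and drop results tagged as a country or a first-level administrative area. This is only a sketch of a filter the project chose not to apply; the names `BROAD_TYPES`, `is_broad`, and `drop_broad`, and the sample results, are hypothetical.

```python
# Result types treated as too broad to plot (an assumption, adjust as needed).
BROAD_TYPES = {"country", "administrative_area_level_1"}


def is_broad(result):
    """True when a geocoding result is tagged as a country or state-level area."""
    return bool(BROAD_TYPES & set(result.get("types", [])))


def drop_broad(results_by_name):
    """Keep only the locations whose geocoding result is more specific."""
    return {name: r for name, r in results_by_name.items() if not is_broad(r)}


# Hand-written results for illustration.
sample = {
    "China": {"types": ["country", "political"]},
    "California": {"types": ["administrative_area_level_1", "political"]},
    "Seattle, WA": {"types": ["locality", "political"]},
}
kept = drop_broad(sample)
```

Filtering this way removes every user who reported only a country or state, which is the skew described above.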

About

License: GNU General Public License v3.0


Languages

Language: Python 100.0%