squiroga8 / dsc-node-centrality-lab-online-ds-sp-000

Network Dynamics: Node Centrality - Lab

Introduction

In this lab, you'll get a chance to practice implementing and interpreting the centrality metrics from the previous section. You'll do this by investigating the social network from Game of Thrones!

Objectives

You will be able to:

  • Understand and explain network centrality and its importance in graph analysis
  • Understand and calculate Degree, Closeness, Betweenness and Eigenvector centrality measures
  • Describe the use case for several centrality measures

Character Interaction Graph Data

A. J. Beveridge and J. Shan created a network from George R. R. Martin's "A Song of Ice and Fire" by extracting relationships between characters in the story. The dataset is available on GitHub. A relationship between two characters was recorded every time one character's name appeared within 15 words of another's. This was designed as an approximate measure of the characters' interactions with each other. The results of this simple analysis are surprisingly rich and produce interesting visuals of the character network.
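The 15-word rule described above can be illustrated with a minimal sketch. This is not the authors' actual extraction code: the function name `cooccurrence_weights`, the single-token character names, and the sample sentence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(tokens, names, window=15):
    """Count how often two different names appear within `window` words of each other.

    Hypothetical helper: assumes each character name is a single token.
    """
    # Positions of every token that is a known character name
    mentions = [(i, tok) for i, tok in enumerate(tokens) if tok in names]
    weights = Counter()
    # Pairwise comparison is quadratic in the number of mentions; fine for a sketch
    for (i, a), (j, b) in combinations(mentions, 2):
        if a != b and j - i <= window:
            # Sort the pair so (a, b) and (b, a) count as the same undirected edge
            weights[tuple(sorted((a, b)))] += 1
    return weights


# Made-up sentence, with a small window so the effect is visible
text = "Jon spoke with Sam at the Wall while Arya trained far away in Braavos"
weights = cooccurrence_weights(text.split(), names={"Jon", "Sam", "Arya"}, window=5)
print(weights)
```

Each counted pair becomes an edge, and the count becomes the edge weight, which is roughly how the `weight` column in the dataset can be read.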

With that, it's your turn to start investigating the most central characters!

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('darkgrid')
import warnings
warnings.filterwarnings('ignore')

%matplotlib inline

Load the dataset

Start by loading the dataset as a pandas DataFrame. From this, you'll then create a network representation of the dataset using NetworkX.

The dataset is stored in the file asoiaf-all-edges.csv.

# Load edges into dataframes
   Source          Target            Type        id  weight
0  Addam-Marbrand  Brynden-Tully     Undirected   0       3
1  Addam-Marbrand  Cersei-Lannister  Undirected   1       3
2  Addam-Marbrand  Gyles-Rosby       Undirected   2       3
3  Addam-Marbrand  Jaime-Lannister   Undirected   3      14
4  Addam-Marbrand  Jalabhar-Xho      Undirected   4       3
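One way to produce a preview like the one above is sketched below. To keep the example self-contained it reads the five rows shown above from an in-memory string; the comma delimiter is an assumption about the real file's layout.

```python
import io

import pandas as pd

# Stand-in for the real file: the five rows shown above, written as CSV text.
# With the actual dataset you would call pd.read_csv('asoiaf-all-edges.csv') instead
# (assuming the file is comma-separated with these column headers).
sample_csv = io.StringIO(
    "Source,Target,Type,id,weight\n"
    "Addam-Marbrand,Brynden-Tully,Undirected,0,3\n"
    "Addam-Marbrand,Cersei-Lannister,Undirected,1,3\n"
    "Addam-Marbrand,Gyles-Rosby,Undirected,2,3\n"
    "Addam-Marbrand,Jaime-Lannister,Undirected,3,14\n"
    "Addam-Marbrand,Jalabhar-Xho,Undirected,4,3\n"
)

df = pd.read_csv(sample_csv)
print(df.head())
```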

Create a Graph

Now that you have the data loaded as a pandas DataFrame, create an empty graph, then iterate through the data and add an edge for each row. Be sure to add the weight to each edge.

# Create an empty graph instance

# Iterate through the dataframe and add each weighted edge to the graph
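A minimal sketch of this step, using a three-row DataFrame built from the sample rows rather than the full file. The variable names `df` and `G` are assumptions; the column names follow the preview table.

```python
import networkx as nx
import pandas as pd

# Small stand-in DataFrame with the same column names as the dataset preview
df = pd.DataFrame({
    "Source": ["Addam-Marbrand", "Addam-Marbrand", "Addam-Marbrand"],
    "Target": ["Brynden-Tully", "Cersei-Lannister", "Jaime-Lannister"],
    "weight": [3, 3, 14],
})

# Undirected graph, matching the 'Undirected' value in the Type column
G = nx.Graph()

# Add one edge per row, storing the interaction count as the 'weight' attribute
for _, row in df.iterrows():
    G.add_edge(row["Source"], row["Target"], weight=row["weight"])

print(G.number_of_nodes(), G.number_of_edges())
print(G["Addam-Marbrand"]["Jaime-Lannister"]["weight"])
```

NetworkX also provides `nx.from_pandas_edgelist(df, "Source", "Target", edge_attr="weight")`, which should build an equivalent graph in one call; the explicit loop is shown because the lab asks you to iterate.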

Calculate Degree

To start the investigation of the most central characters in the books, calculate the degree centrality for each character. Then create a bar graph of the top 10 characters according to degree centrality.

#Your code here
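As a rough illustration of the ranking step, here is a sketch on a hypothetical seven-node toy graph (two triangles joined through node D) rather than the character network. The node names and edges are invented for the example.

```python
import networkx as nx

# Hypothetical toy graph: triangles A-B-C and E-F-G, connected through D
G = nx.Graph([
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"), ("D", "E"),
    ("E", "F"), ("E", "G"), ("F", "G"),
])

# Degree centrality: fraction of the other nodes each node is directly connected to
degree = nx.degree_centrality(G)

# Sort by score, highest first, and keep at most the top 10
top10 = sorted(degree.items(), key=lambda kv: kv[1], reverse=True)[:10]
for name, score in top10:
    print(f"{name}: {score:.3f}")
```

For the bar graph, one option is to wrap the dictionary in a pandas Series and call something like `pd.Series(degree).nlargest(10).plot(kind='barh')`.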

Closeness Centrality

Repeat the above exercise for the top 10 characters according to closeness centrality.

#Your code here
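A comparable sketch for closeness centrality, again on an invented toy graph instead of the real data. It shows that the node with the highest closeness need not be the one with the most connections.

```python
import networkx as nx
import pandas as pd

# Hypothetical toy graph: triangles A-B-C and E-F-G, connected through D
G = nx.Graph([
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"), ("D", "E"),
    ("E", "F"), ("E", "G"), ("F", "G"),
])

# Closeness centrality: (n - 1) divided by the sum of shortest-path distances
# from a node to every other node
closeness = pd.Series(nx.closeness_centrality(G))

# Ten highest scores (the toy graph has only seven nodes, so all are returned)
top_closeness = closeness.sort_values(ascending=False).head(10)
print(top_closeness)
```

Here D has only two neighbors but sits in the middle of the graph, so it ranks first on closeness even though C and E have higher degree.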

Betweenness Centrality

Repeat the process one more time for betweenness centrality.

#Your code here
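The same pattern applies to betweenness centrality. The sketch below uses an invented toy graph, not the character data, and computes the unweighted version.

```python
import networkx as nx

# Hypothetical toy graph: triangles A-B-C and E-F-G, connected through D
G = nx.Graph([
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"), ("D", "E"),
    ("E", "F"), ("E", "G"), ("F", "G"),
])

# Betweenness centrality: share of shortest paths between other node pairs
# that pass through a given node (normalized by default)
betweenness = nx.betweenness_centrality(G)

top10 = sorted(betweenness.items(), key=lambda kv: kv[1], reverse=True)[:10]
for name, score in top10:
    print(f"{name}: {score:.3f}")
```

If you pass `weight='weight'`, NetworkX treats the value as a path length, so characters who interact often would count as far apart. One option, if you want a weighted version, is to store an inverted weight as a separate edge attribute and pass that instead.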

Putting it All Together

Great! Now try putting all of these metrics together along with eigenvector centrality. Combine all four metrics into a single dataframe for each character.

#Your code here
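One possible way to assemble the combined table is sketched below on a toy graph. The column names and the `max_iter` value are choices made for this example, not requirements of the lab.

```python
import networkx as nx
import pandas as pd

# Hypothetical toy graph: triangles A-B-C and E-F-G, connected through D
G = nx.Graph([
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"), ("D", "E"),
    ("E", "F"), ("E", "G"), ("F", "G"),
])

# A dict of dicts becomes a DataFrame with one row per node and one column per metric
centrality = pd.DataFrame({
    "degree": nx.degree_centrality(G),
    "closeness": nx.closeness_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    # max_iter raised as a precaution; power iteration can need more steps on larger graphs
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
})

print(centrality.sort_values("betweenness", ascending=False))

# Pairwise correlation between the four metrics
print(centrality.corr())
```

The correlation matrix gives a quick view of how far the metrics agree; nodes that score high on one metric and low on another are often the interesting ones.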

Identifying Key Players

While centrality can tell us a lot, you've also begun to see how certain individuals may not be the most central characters but can still be pivotal in the flow of information from one community to another. In the previous lesson, such nodes were labeled as 'bridges', acting as intermediaries between two clusters. Try to identify such characters in this dataset.

#Your code here
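There is more than one reasonable way to look for such nodes. The sketch below shows two on a toy graph: structural cut points (`nx.articulation_points` and `nx.bridges`) and a simple heuristic that ranks nodes by betweenness relative to degree. The heuristic and the name `broker_score` are assumptions for illustration, not the lab's prescribed method.

```python
import networkx as nx

# Hypothetical toy graph: triangles A-B-C and E-F-G, connected through D
G = nx.Graph([
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"), ("D", "E"),
    ("E", "F"), ("E", "G"), ("F", "G"),
])

# Structural view: nodes and edges whose removal disconnects the graph
cut_nodes = set(nx.articulation_points(G))
bridge_edges = list(nx.bridges(G))
print("Cut nodes:", cut_nodes)
print("Bridge edges:", bridge_edges)

# Heuristic view: high betweenness relative to degree suggests a broker
# between groups rather than a hub inside one group
degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)
broker_score = {n: betweenness[n] / degree[n] for n in G.nodes() if degree[n] > 0}
best = max(broker_score, key=broker_score.get)
print("Highest broker score:", best)
```

In a dense network such as the character graph, strict cut points may mostly be minor characters attached by a single edge, so the heuristic ranking may be the more informative of the two.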

Drawing the Graph

To visualize all of these relationships, draw a graph of the network.

#Your code here

Subsetting the Graph

As you can see, the above graph is undoubtedly noisy, making it difficult to discern any useful patterns. To fix this, reset the graph and add only those edges whose weight is 75 or greater, then redraw it. To further help with the display, try using nx.spring_layout(G) for the node positions. To jazz it up, try recoloring the nodes you identified as bridges or communication bottlenecks.

#Your code here
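A sketch of the filter-and-redraw steps on a small weighted toy graph. The edge weights, the `key_nodes` set, the colors, and the helper name `draw_subgraph` are all placeholders for illustration; with the real data, `key_nodes` would hold the characters you identified earlier.

```python
import networkx as nx

# Hypothetical weighted toy graph standing in for the full character network
G = nx.Graph()
G.add_weighted_edges_from([
    ("A", "B", 120), ("A", "C", 80), ("B", "C", 30),
    ("C", "D", 90), ("D", "E", 75), ("E", "F", 10),
])

# Start from an empty graph and keep only edges at or above the threshold
threshold = 75
H = nx.Graph()
H.add_edges_from((u, v, d) for u, v, d in G.edges(data=True) if d["weight"] >= threshold)

# Placeholder for the bridge/bottleneck nodes identified in the previous step
key_nodes = {"C", "D"}

# Fixed seed so the layout is reproducible between runs
pos = nx.spring_layout(H, seed=42)
colors = ["tomato" if n in key_nodes else "skyblue" for n in H.nodes()]


def draw_subgraph(graph, pos, colors):
    """Draw the filtered graph with highlighted nodes."""
    # Imported here so the filtering code above runs even without matplotlib
    import matplotlib.pyplot as plt

    plt.figure(figsize=(8, 6))
    nx.draw(graph, pos, node_color=colors, with_labels=True)
    plt.show()
```

Calling `draw_subgraph(H, pos, colors)` in a notebook cell renders the figure. Nodes whose edges all fall below the threshold (F in this toy example) drop out of the picture.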

Summary

In this lab, we looked at different centrality measures for the graph built from the ASOIAF dataset. We also compared these measures to see how they correlate with each other, and saw in practice how using weighted centrality measures may affect the results.
