andreacarrara / pagerank

C implementation of Google's PageRank algorithm

PageRank

Ranking webpages with linear algebra: a C implementation of Google's PageRank algorithm.

Example

In an interconnected web of pages, how can we meaningfully define and quantify the importance of any given page? Let's walk through an example to better understand how the algorithm works.

Suppose we have a web made of four pages that link to one another.

Loading it is as easy as:

Let's start by creating a model of the web.
How many pages does your web have? 4
How many links does page 1 have? 3
1. What page does page 1 link? 2
...

Every webpage must link to at least one other page, and no webpage can link to itself.
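The two rules above can be checked while the answers are typed in. Here is a minimal sketch; the helper names `valid_link_count` and `valid_link` are hypothetical and not taken from `pagerank.c`, and the upper bound on the link count assumes that a page links to any other page at most once.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: is `count` an acceptable number of outgoing links
   for one page in a web of `n` pages? At least one link is required.
   The upper bound n - 1 is an assumption (no duplicate links). */
bool valid_link_count(int count, int n) {
    assert(n >= 2); /* with a single page there is nothing to link to */
    return count >= 1 && count <= n - 1;
}

/* Hypothetical helper: may page `from` link to page `to`?
   Pages are numbered 1..n, as in the prompts above. */
bool valid_link(int from, int to, int n) {
    assert(from >= 1 && from <= n);
    return to >= 1 && to <= n && to != from; /* in range, no self-link */
}
```

An input loop could call these after each answer and ask again whenever one of them returns false.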

The score of any given page is derived from the links made to that webpage from other pages. The web thus becomes a democracy where pages vote for the importance of other pages by linking to them.

Let's store our web in a link matrix where entry (i, j) represents a link from page j to page i.

Each non-zero entry (i, j) equals 1 divided by the number of links of page j. In this democracy of the web, each page gets a total of one vote, which is divided evenly among all of its links.
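As a rough illustration of this construction, the sketch below fills a link matrix from a 0/1 table of links. The function name `build_link_matrix`, the flat row-by-row storage, and the 0-based indices are assumptions made for the example and may differ from what `pagerank.c` does. For page 1 in the example above, which has 3 links, every non-zero entry in its column would be 1/3.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch. adj and A are n x n matrices stored row by row,
   so entry (i, j) lives at index i * n + j (0-based).
   adj[i * n + j] is 1 when page j links to page i, and 0 otherwise. */
void build_link_matrix(int n, const int *adj, double *A) {
    assert(n > 0 && adj != NULL && A != NULL);
    for (int j = 0; j < n; j++) {
        /* Count the outgoing links of page j (column j of adj). */
        int links = 0;
        for (int i = 0; i < n; i++)
            links += adj[i * n + j];
        assert(links > 0); /* every page must link to at least one other page */

        /* Page j splits its single vote evenly among the pages it links to. */
        for (int i = 0; i < n; i++)
            A[i * n + j] = adj[i * n + j] ? 1.0 / links : 0.0;
    }
}
```

Because each page hands out exactly one vote in total, every column of the resulting matrix sums to 1.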

This transforms the web ranking problem into the standard problem of finding an eigenvector for a square matrix. Applying the power iteration method, we compute the score column as the limit of iterations

where S is a column with all entries equal to 1 over the number of webpages, and m is a real number between 0 and 1. X can be initialized as any column with positive components and norm equal to 1.
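The iteration can be sketched in C as follows. This is an illustration under stated assumptions rather than the code in `pagerank.c`: it assumes the update has the damped form X ← (1 − m)·A·X + m·S used in the Bryan and Leise paper, where A is a name chosen here for the link matrix. The function name `power_iteration`, the stopping rule (sum of absolute changes below `tol`), and the `max_iter` cap are likewise hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch of the power iteration.
   Assumed update: x <- (1 - m) * A * x + m * s, where every s_i = 1 / n.
   A is n x n, stored row by row (entry (i, j) at index i * n + j).
   The scores are written to x. Returns the number of iterations
   performed, or -1 if memory allocation fails. */
int power_iteration(int n, const double *A, double m,
                    double tol, int max_iter, double *x) {
    assert(n > 0 && m > 0.0 && m < 1.0);
    double *next = malloc((size_t)n * sizeof *next);
    if (next == NULL)
        return -1;

    /* Start from a column with positive entries that sum to 1. */
    for (int i = 0; i < n; i++)
        x[i] = 1.0 / n;

    int iter = 0;
    double change = tol + 1.0; /* forces at least one pass */
    while (iter < max_iter && change > tol) {
        change = 0.0;
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++)
                sum += A[i * n + j] * x[j];
            next[i] = (1.0 - m) * sum + m / n;
            double d = next[i] - x[i];
            change += d < 0.0 ? -d : d; /* accumulate |next_i - x_i| */
        }
        for (int i = 0; i < n; i++)
            x[i] = next[i];
        iter++;
    }

    free(next);
    return iter;
}
```

When the columns of A sum to 1, each pass keeps the entries of `x` positive and summing to 1, so no renormalization is needed in this sketch. The value of m is left to the caller; the source does not say which value the program uses.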

Lastly, we print the results:

Here are the standings:
1. Page 1: 0.368150
2. Page 3: 0.287969
3. Page 4: 0.202081
4. Page 2: 0.141801
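Producing standings like these amounts to sorting the pages by score in descending order. A minimal sketch with the standard `qsort` follows; the `standing` struct and both function names are hypothetical, and the output format simply mirrors the lines above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record pairing a page number with its score. */
typedef struct {
    int page;
    double score;
} standing;

/* qsort comparator: higher scores come first. */
static int by_score_desc(const void *a, const void *b) {
    double sa = ((const standing *)a)->score;
    double sb = ((const standing *)b)->score;
    return (sa < sb) - (sa > sb);
}

/* Sort the rows in place, from highest to lowest score. */
void sort_standings(standing *rows, int n) {
    assert(rows != NULL && n > 0);
    qsort(rows, (size_t)n, sizeof *rows, by_score_desc);
}

/* Print one line per page, in the order the rows are stored. */
void print_standings(const standing *rows, int n) {
    printf("Here are the standings:\n");
    for (int k = 0; k < n; k++)
        printf("%d. Page %d: %f\n", k + 1, rows[k].page, rows[k].score);
}
```

Filling an array of `standing` with pages 1 to 4 and their scores, then calling `sort_standings` followed by `print_standings`, would list the pages from most to least important.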

Usage

Clone the repository and run:

$ gcc pagerank.c -lm -o pagerank
$ ./pagerank

Resources

This project was inspired by the paper "The $25,000,000,000 Eigenvector: The Linear Algebra Behind Google" by Kurt Bryan and Tanya Leise. Further information is available here, from lecture 5 to lecture 11.
