yungene / Visualising-GitHub

CS3012 - "Social Graph"

Visualising-GitHub

Part 1 - GitHub Access

Commit that identifies the solution to this part: bf944d30d144e70019e2a4b3d102234f17ec43b6. See 3146ffa8583af581a31880ea25903391a264513b for the program that allows OAuth2 token authorization.

A small program under /Github-acess demonstrates the use of the Eclipse EGit GitHub library for Java to retrieve and display data about a GitHub user's repositories and the number of "watchers" for each public repository. In this case, data about my own account is retrieved and displayed. More details can be found in the README in the program's directory.

The program is built using Apache Maven with a POM included in the directory. The program was built and run using Eclipse Photon with the Maven M2E plugin.

Part 2 - Visualising GitHub

Important links:

Task

Task: Interrogate the GitHub API to build a visualisation of the available data that elucidates some aspect of the software engineering process, such as a social graph of developers and projects, or a visualisation of individual or team performance.

Project Information

Note: Please see the Design Doc for a full description of the project and its development.

The project visualises information about repositories on GitHub. On the main page, the only metric presented is a line (area) chart of active development team size against time, with releases (tags) also shown on the time axis. (See below for the additional network graph.) This visualisation intends to investigate whether active team size correlates with the timing of releases. The initial conjecture was that active team size increases prior to a release, peaks at the time of the release and decreases straight after it.

The project consists of three logical parts: a data collector, a database and a website. The website extracts the list of processed repositories from the database and presents it to the user in a drop-down menu. Data for the chosen repository is then read from the database and presented as an interactive graph.

By hovering over the graph, the user can see the exact coordinates of each point and the release that this point is working towards. The user can zoom in by brushing on the graph: click, drag and release to zoom in on the selected area, so a smaller selection gives a stronger zoom. Double click to reset the scaling.
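The hover behaviour described above, where each point shows the release it is working towards, can be sketched as a sorted lookup. This is a minimal illustration and not the project's actual code: the class `ReleaseLookup` and its methods are hypothetical, and it assumes that "working towards" means the earliest release dated on or after the point.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: maps a data point's date to the release it leads up to. */
final class ReleaseLookup {
    // Releases keyed by tag date; a TreeMap keeps them sorted for ceiling lookups.
    private final TreeMap<LocalDate, String> releasesByDate = new TreeMap<>();

    void addRelease(LocalDate date, String tagName) {
        releasesByDate.put(date, tagName);
    }

    /**
     * Assumption: a point "works towards" the first release dated on or
     * after it. Returns null when no such release exists.
     */
    String upcomingRelease(LocalDate pointDate) {
        Map.Entry<LocalDate, String> next = releasesByDate.ceilingEntry(pointDate);
        return next == null ? null : next.getValue();
    }
}
```

Points that fall after the last known release have no upcoming release, so the sketch returns `null` for them and leaves the display decision to the caller.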

Usage

Use the select box to choose the repository to display. The list is generated from the data in the database, and list names follow the format "repo_owner,repo_name,days_backfill,threshold". See the headings below and the title of the graph for the parameters of the current graph.

This is a line (area) chart with three main components. The Y-axis displays the active team size as measured using the metric described in the design doc. The X-axis displays the time/date, so the chart shows the change in team size with respect to time. Red vertical lines represent individual releases (tags) in that repository, placed at their time values, with the name of each release displayed above its line.

The graph is interactive. Hover over a point to see the exact value as well as the name of the release this commit is contributing to. Brush on the graph to zoom in: click, drag and release, and a smaller selection gives a stronger zoom. Double click to reset the scaling.
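To make the list-name format concrete, here is a small sketch that splits one entry of the form "repo_owner,repo_name,days_backfill,threshold" into its four parts. The class `RepoEntry` is hypothetical, and the numeric types are assumptions: `days_backfill` is treated as a whole number of days and `threshold` as a decimal, since the design doc is the authority on what they mean.

```java
/** Hypothetical value class for one select-box entry. */
final class RepoEntry {
    final String owner;
    final String name;
    final int daysBackfill;   // assumed to be a whole number of days
    final double threshold;   // assumed numeric; exact meaning is in the design doc

    private RepoEntry(String owner, String name, int daysBackfill, double threshold) {
        this.owner = owner;
        this.name = name;
        this.daysBackfill = daysBackfill;
        this.threshold = threshold;
    }

    /** Parses "repo_owner,repo_name,days_backfill,threshold". */
    static RepoEntry parse(String entry) {
        // Limit -1 keeps trailing empty fields so malformed input is detected.
        String[] parts = entry.split(",", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException(
                "Expected 4 comma-separated fields but got: " + entry);
        }
        return new RepoEntry(
            parts[0].trim(),
            parts[1].trim(),
            Integer.parseInt(parts[2].trim()),
            Double.parseDouble(parts[3].trim()));
    }
}
```

For example, `RepoEntry.parse("yungene,Visualising-GitHub,30,2")` yields owner `yungene`, name `Visualising-GitHub`, a 30-day backfill and a threshold of 2.0; the values here are illustrative only.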

Demo page. (With sample prefetched data. Original webpage includes a backend and a database. Should be online and up to date.)

A pattern similar to that described by the conjecture.

Demonstration of the webpage, including select box, zoom in and cursor.

Followers graph

In addition to the main metric described above, I also created a second page, /users, which presents a followers/followings graph. It builds a directed graph with nodes for users and edges for the "is followed by" and "is following" relations, rendered as a Force Directed Graph with Labels. I created a separate crawler for this. The crawler does a simple BFS starting from my account.
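The crawler's traversal can be sketched as a plain breadth-first search. This is an illustration under stated assumptions, not the repository's crawler: `FollowerBfs`, the `neighbours` function (a stand-in for whatever GitHub API call returns a user's followers or followings) and the `maxUsers` cap are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical BFS over a follower graph. */
final class FollowerBfs {
    /**
     * Visits users breadth-first from the start account and returns them in
     * visit order. The neighbours function is a stand-in for the API lookup;
     * maxUsers caps the crawl so it ends on a large network.
     */
    static List<String> crawl(String start,
                              Function<String, List<String>> neighbours,
                              int maxUsers) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        while (!queue.isEmpty() && order.size() < maxUsers) {
            String user = queue.remove();
            order.add(user);
            for (String next : neighbours.apply(user)) {
                // Set.add returns false for users already discovered,
                // which prevents cycles from being enqueued again.
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return order;
    }
}
```

A cap of this kind also helps keep the number of API requests, and therefore rate-limit pressure, bounded.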

Demo page. (With sample prefetched data. Original webpage includes a backend and a database. Should be online and up to date.)

Screenshot: followers network.

Demonstration of the webpage for the graph.

Technologies used

Project Setup

The project consists of 3 main parts:

  • MySQL local DB
  • Data collector in Java
  • Local Node.js webserver

At present, all three components are required to run the project. TODO: allow the website to run without a database using some sample data.

MySQL database

  • Create the DB and tables using the script under /MySQL/init.sql. Use /MySQL/insert_sample_repos.sql to insert a list of sample repos into the DB. This list will be used by the data collector.
  • Create the database config using the credentials for your local MySQL instance. Under /data-collector/ create a config folder. Under /data-collector/config/ create a dbconfig.properties file. Insert and modify the template:
 host=localhost
 user=root
 password=****
 db=dbName
 port=3306
  • Under /visual/ create a config folder. In this folder create a config.js file. Insert and modify the template:
var config = {
    development: {
        // URL to be used in link generation
        url: 'http://my.site.com',
        // MySQL connection settings
        database: {
            host: 'localhost',
            user: 'root',
            password: '****',
            db: 'dbName',
            port: '3306'
        },
        // server details
        server: {
            host: '127.0.0.1',
            port: '3422'
        }
    }
};
module.exports = config;
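On the data collector side, the dbconfig.properties template above can be read with the standard `java.util.Properties` class. The following is a minimal sketch rather than the collector's actual code: `DbConfig` and its methods are hypothetical, while the key names (`host`, `user`, `password`, `db`, `port`) come from the template and the URL follows the usual `jdbc:mysql://host:port/database` form used by MySQL Connector/J.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical holder for the values in dbconfig.properties. */
final class DbConfig {
    final String host;
    final int port;
    final String db;
    final String user;
    final String password;

    private DbConfig(Properties p) {
        host = required(p, "host");
        port = Integer.parseInt(required(p, "port"));
        db = required(p, "db");
        user = required(p, "user");
        password = required(p, "password");
    }

    /** Loads the five template keys from any character source. */
    static DbConfig load(Reader source) {
        Properties p = new Properties();
        try {
            p.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new DbConfig(p);
    }

    // Fails fast with the key name when a template line is missing or blank.
    private static String required(Properties p, String key) {
        String value = p.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing property: " + key);
        }
        return value.trim();
    }

    /** Builds the JDBC URL for MySQL Connector/J. */
    String jdbcUrl() {
        return "jdbc:mysql://" + host + ":" + port + "/" + db;
    }
}
```

With the template values, `jdbcUrl()` gives `jdbc:mysql://localhost:3306/dbName`, which together with `user` and `password` could be passed to `DriverManager.getConnection`.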

Data Collector

  • Ensure that Java 8 is installed.
  • Ensure that Maven is installed. Maven 3.5.3 was used. More specifically, it was tested to work with M2E v1.9.0.
  • Build and install the Maven dependencies listed in data-collector/pom.xml. These include the org.eclipse.egit.github.core library and the MySQL Java connector.
  • Ensure that MySQL Connector/J is installed. It was tested to work with Connector/J 8.0.18. It should be installed by Maven. Otherwise, install it manually and link it with the project.
  • To make the collector less likely to hit a rate limit, provide an OAuth 2 GitHub token. Under /data-collector/ create a config folder. Under /data-collector/config/ create a config.txt file. Insert and modify the template:
*token*

  • Run the Main.java file. See the getRepos() function for how the repositories to be processed are supplied. For the followers graph, run GraphMain.
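Reading the token from config.txt could look like the sketch below. It is an assumption-laden illustration, not the collector's code: `TokenFile` is a hypothetical class, and it assumes the file holds nothing but the token, taking the first non-blank line as the value.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical reader for the OAuth 2 token stored in config.txt. */
final class TokenFile {
    /** Returns the first non-blank line of the contents, trimmed. */
    static String parse(String contents) {
        for (String line : contents.split("\r?\n")) {
            String candidate = line.trim();
            if (!candidate.isEmpty()) {
                return candidate;
            }
        }
        throw new IllegalStateException("Token file is empty");
    }

    /** Reads the file (Java 8 compatible) and extracts the token. */
    static String read(Path file) {
        try {
            byte[] bytes = Files.readAllBytes(file);
            return parse(new String(bytes, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting string would then be handed to the GitHub client. With the EGit GitHub library, that is typically done through `GitHubClient.setOAuth2Token`.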

Website

Install:
npm install
Start webserver:
npm start
Access the local webpage:
http://localhost:3000
http://localhost:3000/users

Languages

HTML 61.2%, Java 24.0%, JavaScript 10.2%, TSQL 4.5%, CSS 0.2%