STEMHA / FlashGraph

An SSD-based graph processing engine for billion-node graphs

There are two main components in the repository: FlashGraph and SAFS.

FlashGraph

FlashGraph is a semi-external memory graph processing engine, optimized for a high-speed SSD array. FlashGraph provides a flexible programming interface to help users implement graph algorithms. In FlashGraph, users write serial code that reads data in memory, and FlashGraph executes that code in parallel and out of core. This makes it possible to process a billion-node graph on a single machine, with performance comparable to or exceeding that of in-memory graph engines such as PowerGraph.
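To illustrate the "serial user code, parallel engine" idea described above, here is a minimal, self-contained sketch. It is not FlashGraph's API: `ToyGraph`, `run_on_all_vertices`, and `compute_out_degrees` are hypothetical names invented for this example, and the toy engine keeps the whole graph in memory instead of reading edge lists from SSDs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical toy graph: adjacency lists held entirely in memory.
// (FlashGraph keeps edge lists on SSDs; this sketch does not model that.)
struct ToyGraph {
    std::vector<std::vector<std::size_t>> out_edges;
    std::size_t num_vertices() const { return out_edges.size(); }
};

// Hypothetical engine: calls the user's serial function once per vertex,
// splitting the vertex range into contiguous chunks, one per worker thread.
template <typename VertexFn>
void run_on_all_vertices(const ToyGraph& g, unsigned num_threads, VertexFn fn) {
    const std::size_t n = g.num_vertices();
    num_threads = std::max(1u, num_threads);
    const std::size_t chunk = (n + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) continue;  // nothing left for this worker
        workers.emplace_back([&g, &fn, begin, end] {
            for (std::size_t v = begin; v < end; ++v) {
                fn(v, g.out_edges[v]);
            }
        });
    }
    for (auto& w : workers) w.join();
}

// User code: written as if it were serial. Each call touches only the
// slot of its own vertex, so no locking is needed in this example.
std::vector<std::size_t> compute_out_degrees(const ToyGraph& g,
                                             unsigned num_threads) {
    std::vector<std::size_t> degree(g.num_vertices(), 0);
    run_on_all_vertices(
        g, num_threads,
        [&degree](std::size_t v, const std::vector<std::size_t>& neighbors) {
            degree[v] = neighbors.size();
        });
    return degree;
}
```

Calling `compute_out_degrees(g, 4)` on a populated `ToyGraph` returns one out-degree per vertex. The point of the design is that the per-vertex lambda contains no threading logic; scheduling is left entirely to the engine.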

FlashGraph provides R bindings called FlashGraphR to help domain experts use FlashGraph. FlashGraphR has a wrapper for each graph algorithm implemented in FlashGraph. Domain experts can use FlashGraph to run expensive graph algorithms and then use the comprehensive R libraries to further analyze the results. This combination gives domain experts a powerful tool for analyzing massive graphs (millions or even billions of vertices) very efficiently. FlashGraphR is further integrated with [iGraph](http://igraph.org/) so that users can easily convert a FlashGraphR object to an iGraph object and vice versa.

SAFS

SAFS is an open-source library that provides a filesystem-like interface in userspace to help users access a large SSD array in a NUMA machine. It is designed to eliminate overhead in the Linux block subsystem without modifying the kernel, and to achieve the maximal performance of a large SSD array in a NUMA machine. FlashGraph is an application that demonstrates the power of SAFS.

Documentation

FlashGraph Quick start guide

FlashGraph programming tutorial

FlashGraphR Quick Start

FlashGraph performance and scalability

SAFS user manual

Publications

Heng Wang, Da Zheng, Randal Burns, Carey Priebe, Active Community Detection in Massive Graphs, SDM-Networks 2015 [pdf]

Da Zheng, Disa Mhembere, Randal Burns, Joshua Vogelstein, Carey E. Priebe, Alexander S. Szalay, FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs, FAST'15, [pdf]

Da Zheng, Randal Burns, Alexander S. Szalay, Toward Millions of File System IOPS on Low-Cost, Commodity Hardware, in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC '13), [pdf]

Contact

Mailing list: flashgraph-dev@googlegroups.com

Join the chat at https://gitter.im/icoming/FlashGraph

About

License: Apache License 2.0


Languages

C++ 94.8%, R 1.3%, Makefile 1.2%, Perl 1.1%, Shell 0.7%, CMake 0.6%, C 0.3%