A benchmark suite for performance introspection of graph databases

pmacko86/pig-bench

PIG: Performance Introspection of Graph Databases

The explosion of graph data in social and biological networks, recommendation systems, provenance databases, etc. makes graph storage and processing of paramount importance.

PIG is a new graph benchmarking framework, which provides both a methodology for evaluating graph database performance and a mechanism to carry out such evaluations. It takes a hierarchical approach to benchmarking. The suite has three layers of benchmarks:

  • Primitive operations such as reading and writing vertices and edges
  • Composite access patterns such as extracting k-hop neighborhoods
  • Graph algorithms such as shortest path finding and computing centrality metrics
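To make the middle layer concrete, here is a minimal sketch of what a k-hop neighborhood extraction does, written as a breadth-first expansion over an in-memory adjacency map. It is an illustration only, not PIG's implementation: the class name KHopSketch, the method kHopNeighborhood, and the adjacency-map representation are all assumptions made for this example, and a real benchmark would issue the equivalent reads against the graph database under test.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical illustration; not part of the PIG code base.
public class KHopSketch {

    // Returns every vertex reachable from 'start' in at most k hops,
    // excluding 'start' itself. 'adjacency' maps a vertex id to the
    // ids of its out-neighbors (an assumed, simplified representation).
    public static Set<Integer> kHopNeighborhood(
            Map<Integer, List<Integer>> adjacency, int start, int k) {
        Set<Integer> visited = new HashSet<>();
        visited.add(start);
        Queue<Integer> frontier = new ArrayDeque<>();
        frontier.add(start);

        // Expand one hop per iteration until k hops are done
        // or there is nothing left to expand.
        for (int hop = 0; hop < k && !frontier.isEmpty(); hop++) {
            Queue<Integer> next = new ArrayDeque<>();
            for (int v : frontier) {
                List<Integer> neighbors =
                        adjacency.getOrDefault(v, Collections.<Integer>emptyList());
                for (int w : neighbors) {
                    // Set.add returns true only for unseen vertices.
                    if (visited.add(w)) {
                        next.add(w);
                    }
                }
            }
            frontier = next;
        }

        visited.remove(start);
        return visited;
    }

    public static void main(String[] args) {
        // Small example graph: 1 -> {2, 3}, 2 -> {4}, 3 -> {4}, 4 -> {5}
        Map<Integer, List<Integer>> g = new HashMap<>();
        g.put(1, Arrays.asList(2, 3));
        g.put(2, Arrays.asList(4));
        g.put(3, Arrays.asList(4));
        g.put(4, Arrays.asList(5));

        // Vertices 2, 3 and 4 lie within two hops of vertex 1; vertex 5 does not.
        System.out.println(kHopNeighborhood(g, 1, 2));
    }
}
```

Each hop in this sketch is built from the primitive operations of the first layer (reading a vertex and its edges), which is the sense in which the layers form a hierarchy.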

This framework allows for comparisons between systems as well as single system introspection. Such introspection allows one to evaluate the degree to which systems exploit their knowledge of graph access patterns. The suite also comes with a web interface that makes it easy to run benchmarks and to visualize and analyze the collected data.

Quick-Start

To run PIG, you will need:

After installing all the prerequisites and checking out the source code of PIG, cd into the graphdb-bench directory and type:

mvn install

You can then start the web interface using:

./runWebInterfaceServer.sh

This will start a server on port 8080. Alternatively, you can run the benchmark tools directly from the command line using:

./runBenchmarkSuite.sh

Use the --help option to get the list of available commands or +help to see advanced options and options for configuring the JVM.

Configuration

To change PIG's configuration, edit the following file:

graphdb-bench/src/main/resources/com/tinkerpop/bench/bench.properties

You can also override many options using command-line arguments and/or the web interface.

Datasets

You can generate your own datasets using fgftool, which is distributed as part of Blueprints Extensions (one of the prerequisites of PIG). You can also download datasets with up to 1 million nodes from here:

https://drive.google.com/folderview?id=0B3jkRHQ7nKvnbDhsWHBySVV6VVk&usp=sharing

Place the datasets in the directory specified in the configuration file. The default is data/datasets in the project directory.

Publications

  • Peter Macko, Daniel Margo, and Margo Seltzer. Performance Introspection of Graph Databases. 6th International Systems and Storage Conference (SYSTOR '13), Haifa, Israel, June 2013.
