Programming Model

The Computation Model

Graph data is stored in the RAM of the cluster nodes using a key-value based storage model which provides location-transparency in accessing data. Node ( graph vertex) IDs are used as keys. A node object contains directed edges for other nodes in the graph. Beehive allows different types of edges to exist between two nodes; this facilitates applications involving hyper graph structures. The two abstractions in the Beehive computation model are Tasks and Transactions

Task: A Vertex-centric computation for a task is specific to the application program and the algorithm. A task reads and/or updates data of some vertices. A task can create new tasks on its completion.
Transaction: Each task is executed as a transaction based on the optimistic concurrency control model Kung & Robinson, 1981. In any case of read-write and write-write conflict among any concurrently executing transactional tasks, only one commits while the others are aborted.

Execution phases of a transactional task are based on the optimistic execution model Kung & Robinson, 1981 as illustrated below. Beehive's framework executes a Validation Service on one of the dedicated cluster nodes. A transactional task executes the following steps during its execution. If the validation phase results in abort, then the task is re-executed as a new transaction.

Transaction (task) acquires the Start-Timestamp when it begins execution.
Transaction executes Read and Compute phases
Validation service checks that no concurrent transactions committed after the start-timestamp has any read-write or write-write conflicts.
Validation service assigns a Commit-Timestamp to the transaction on commit
Transaction writes the updates to the global key-value storage
Adds any new tasks to the taskpool
Reports completion to the global validation service

The System Architecture

Each cluster node executes a Java process which contains the following components:

A local workpool containing tasks to be executed
Key-value based storage system called HashtableServer, storing a subset of the items in the globally shared storage syste
A pool of worker threads which fetch and execute tasks from the localworkpool

architecture