
GigaPaxos is a group-scalable reconfigurable consensus system that can be used to efficiently and easily manage a very large number of independent lightweight replicated state machines.

gigapaxos

Obtaining gigapaxos

Option 1: Binary:

Option 2: Source:

  • Download gigapaxos from https://github.com/MobilityFirst/gigapaxos
  • In the main directory called gigapaxos, type ant, which will create a jar file dist/gigapaxos-<version>.jar. Make sure that ant uses Java 1.8 or higher.

GigaPaxos overview

GigaPaxos is a group-scalable replicated state machine (RSM) system, i.e., it allows applications to easily create and manage a very large number of separate RSMs. Clients can associate each service with its own RSM hosted on a subset of a pool of server machines. Thus, different services may be replicated on different sets of machines in accordance with their fault tolerance or performance requirements.

The underlying consensus protocol for each RSM is Paxos; however, it is carefully engineered to be extremely lightweight and fast. For example, each RSM uses only ~300 bytes of memory when it is idle (i.e., not actively processing requests), so commodity machines can participate in millions of different RSMs. When actively processing requests, the message overhead per request is similar to Paxos, but automatic batching of requests and Paxos messages significantly improves throughput by reducing per-message overhead, especially when the number of different RSM groups is small. For example, gigapaxos achieves a small-noop-request throughput of roughly 80K/s per core (and proportionally more on multicore) on commodity machines.

The lightweight API for creating and interacting with different RSMs allows applications to “carelessly” create consensus groups on the fly for even small shared objects, e.g., a simple counter or a lightweight stateful servlet. GigaPaxos also has extensive support for reconfiguration, i.e., applications can programmatically change the membership of different RSMs by writing their own policy classes.

GigaPaxos has a simple Replicable wrapper API that any “black-box” application can implement in order to be automatically replicated and reconfigured as needed by GigaPaxos. This API requires three methods to be implemented:

boolean execute(Request request) 
String checkpoint(String name)
boolean restore(String name, String state) 

to respectively execute a request, obtain a state checkpoint, or restore the state of a service named “name”. GigaPaxos ensures that applications implementing the Replicable interface are also automatically Reconfigurable, i.e., their replica locations are automatically changed in accordance with an application-specified policy.
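To make the three methods concrete, here is a minimal sketch of a Replicable-style application. The interface below is a simplified stand-in for gigapaxos' actual Replicable interface (which takes a Request object rather than a raw string), and the counter app is entirely hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class CounterReplicable {
    // Simplified stand-in for gigapaxos' Replicable interface; the real one
    // takes a Request object, not a raw string.
    interface SimpleReplicable {
        boolean execute(String request);            // apply one request
        String checkpoint(String name);             // serialize current state
        boolean restore(String name, String state); // reset state from a checkpoint
    }

    // A toy counter service: each request "incr:<name>" bumps a counter.
    static class CounterApp implements SimpleReplicable {
        private final Map<String, Integer> counters = new HashMap<>();

        public boolean execute(String request) {
            if (request.startsWith("incr:")) {
                counters.merge(request.substring("incr:".length()), 1, Integer::sum);
                return true;
            }
            return false;
        }

        public String checkpoint(String name) {
            return Integer.toString(counters.getOrDefault(name, 0));
        }

        public boolean restore(String name, String state) {
            if (state == null) { counters.remove(name); return true; } // wipe state
            counters.put(name, Integer.parseInt(state));
            return true;
        }
    }

    public static void main(String[] args) {
        CounterApp app = new CounterApp();
        app.execute("incr:hits");
        app.execute("incr:hits");
        String cp = app.checkpoint("hits"); // "2"
        app.restore("hits", null);          // wipe, as during reconfiguration
        app.restore("hits", cp);            // reinitialize from the checkpoint
        System.out.println(app.checkpoint("hits")); // prints 2
    }
}
```

Note how restore doubles as both the "wipe" operation (null state) and the "initialize from checkpoint" operation; gigapaxos uses exactly this pair of calls when moving a service between epochs.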

Tutorial 1: Single-machine Replicable test-drive

The default gigapaxos.properties file in the top-level directory has two sets of entries, respectively for "active" replica servers and "reconfigurator" servers. Every line in the former starts with the string active. followed by a string that is the name of that active server (e.g., 100, 101, or 102 below), followed by the separator '=' and a host:port listening address for that server. Likewise, every line in the latter starts with the string reconfigurator. followed by its name, the separator, and its host:port information.

The APPLICATION parameter below specifies which application we will be using. The default is edu.umass.cs.reconfiguration.examples.noopsimple.NoopApp, so uncomment the APPLICATION line below (by removing the leading #), as we will be using the simpler "non-reconfigurable" NoopPaxosApp application in this first tutorial. A non-reconfigurable application's replicas cannot be moved around by gigapaxos, but a Reconfigurable application's (such as NoopApp's) replicas can.

#APPLICATION=edu.umass.cs.gigapaxos.examples.noop.NoopPaxosApp

active.100=127.0.0.1:2000
active.101=127.0.0.1:2001
active.102=127.0.0.1:2002

reconfigurator.RC0=127.0.0.1:3100
reconfigurator.RC1=127.0.0.1:3101
reconfigurator.RC2=127.0.0.1:3102

At least one active server is needed to use gigapaxos. Three or more active servers are needed in order to tolerate a single active server failure. At least one reconfigurator is needed in order to be able to reconfigure RSMs on active servers, and three or more for tolerating reconfigurator server failures. Both actives and reconfigurators use consensus, so they need at least 2f+1 replicas in order to make progress despite up to f failures. Reconfigurators form the "control plane" of gigapaxos, while actives form the "data plane" that is responsible for executing client requests.
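The 2f+1 relationship works out as a simple majority-quorum calculation (nothing gigapaxos-specific here):

```java
public class QuorumMath {
    // With n replicas, a majority quorum tolerates f = (n - 1) / 2 failures;
    // conversely, tolerating f failures requires n = 2f + 1 replicas.
    static int maxFailures(int n) { return (n - 1) / 2; }
    static int replicasNeeded(int f) { return 2 * f + 1; }

    public static void main(String[] args) {
        System.out.println(maxFailures(3));    // 3 actives tolerate 1 failure
        System.out.println(replicasNeeded(1)); // tolerating 1 failure needs 3 replicas
        System.out.println(maxFailures(1));    // a single server tolerates no failures
    }
}
```

This is why the default file ships with 3 actives and 3 reconfigurators: each group can survive one failure.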

For the single-machine, local test, except for setting APPLICATION to NoopPaxosApp, you can leave the default gigapaxos.properties file unchanged with 3 actives and 3 reconfigurators, even though we won't really be using the reconfigurators at all.
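Concretely, once the APPLICATION line is uncommented, the relevant portion of gigapaxos.properties for Tutorial 1 looks as follows (server entries unchanged from above):

```properties
APPLICATION=edu.umass.cs.gigapaxos.examples.noop.NoopPaxosApp

active.100=127.0.0.1:2000
active.101=127.0.0.1:2001
active.102=127.0.0.1:2002

reconfigurator.RC0=127.0.0.1:3100
reconfigurator.RC1=127.0.0.1:3101
reconfigurator.RC2=127.0.0.1:3102
```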

Run the servers as follows from the top-level directory:

./bin/gpServer.sh start all

If any actives or reconfigurators or other servers are already listening on those ports, you will see errors in the log file (/tmp/gigapaxos.log by default). To make sure that no servers are already running, do

./bin/gpServer.sh stop all

To start or stop a specific active or reconfigurator, replace all above with the name of an active (e.g., 100) or reconfigurator (e.g., RC1) above.

Wait until you see all servers ready on the console before starting any clients.

Then, start the default client as follows from the top-level directory:

./bin/gpClient.sh

The client will by default use NoopPaxosAppClient if the application is NoopPaxosApp, and will use NoopAppClient if the application is the default NoopApp. As we are using the former app in this tutorial, running the above script will launch NoopPaxosAppClient.

For any application, a default paxos group called <app_name>0 will be created by the servers, so in this example, our (only) paxos group will be called NoopPaxosApp0.

The NoopPaxosAppClient client will simply send a few requests to the servers, wait for the responses, and print them on the console. The client is really simple and illustrates how to send callback-based requests. You can view its source here: NoopPaxosAppClient.java

NoopPaxosApp is a trivial instantiation of Replicable and its source is here: NoopPaxosApp.java

You can verify that stopping one of the actives as follows will not affect the system's liveness; however, any requests going to the failed server will of course not get responses. The sendRequest method in NoopPaxosAppClient by default sends each request to a random replica, so roughly a third of the requests will be lost with a single failure.

bin/gpServer.sh stop 101

Next, browse through the methods in NoopPaxosAppClient's parent PaxosClientAsync.java and use one of the sendRequest methods therein to direct all requests to a specific active server, and verify that all requests (almost always) succeed despite a single active failure. You can also verify that with two failures, no requests will succeed.
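The "roughly a third" figure follows directly from the uniform-random replica choice. The following self-contained simulation (hypothetical server names matching the config above; it does not use the gigapaxos client API) illustrates why:

```java
import java.util.Random;

public class RandomReplicaLoss {
    // Simulates the default client behavior of sending each request to a
    // replica chosen uniformly at random among 3 actives, one of which
    // (say, 101) has been stopped and never responds.
    public static double lostFraction(int numRequests, long seed) {
        Random rand = new Random(seed);
        String[] actives = {"100", "101", "102"};
        int lost = 0;
        for (int i = 0; i < numRequests; i++) {
            String target = actives[rand.nextInt(actives.length)];
            if (target.equals("101")) lost++; // request to the stopped server is lost
        }
        return (double) lost / numRequests;
    }

    public static void main(String[] args) {
        // Converges to ~0.333 as numRequests grows.
        System.out.printf("lost fraction: %.3f%n", lostFraction(100000, 42));
    }
}
```

Directing all requests to a specific live active, as the exercise above asks, removes this loss entirely (modulo the occasional in-flight request during the failure).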

Tutorial 2: Single-machine Reconfigurable test-drive

For this test, we will use a fresh gigapaxos install and set APPLICATION back to the default NoopApp by simply re-commenting the line we uncommented earlier, as in the default gigapaxos.properties file shown below.

#APPLICATION=edu.umass.cs.gigapaxos.examples.noop.NoopPaxosApp  
#APPLICATION=edu.umass.cs.reconfiguration.examples.noopsimple.NoopApp # default

Note: It is important to use a fresh gigapaxos install, as the APPLICATION cannot be changed midway in an existing gigapaxos directory; doing so will lead to errors, as gigapaxos will try to feed requests to the application that the application will fail to parse. An alternative to a fresh install is to remove all gigapaxos logs as follows from their default locations (or from their non-default locations if you changed them in gigapaxos.properties):

rm -rf ./paxos_logs ./reconfiguration_DB

Next, run the servers and clients exactly as before. You will see console output showing that NoopAppClient creates a few names and successfully sends a few requests to them. A Reconfigurable application must implement slightly different semantics from just a Replicable application. You can browse through the source of NoopApp and NoopAppClient and the documentation therein, linked below:

NoopApp.java [doc]

NoopAppClient.java [doc]

Step 1: Repeat the same failure scenario as above and verify that the actives exhibit the same liveness properties as before.

Step 2: Set the property RECONFIGURE_IN_PLACE=true in gigapaxos.properties in order to enable trivial reconfiguration, which means reconfiguring a replica group to the same replica group while going through all of the motions of the three-phase reconfiguration protocol: (1) STOP the previous epoch at the old replica group; (2) START the new epoch at the new replica group after having it fetch the final epoch state from the old epoch's replica group; and (3) have the old replica group DROP all state from the previous epoch.

The default reconfiguration policy trivially reconfigures the replica group after every request. This policy is clearly overkill, as the overhead of reconfiguration will typically be much higher than that of processing a single application request (but it allows us to potentially create a new replica near every location from which even a single client request originates). Our goal here is just to test a proof of concept and understand how to implement other, more practical policies.

Step 3: Run NoopAppClient by simply invoking the client command like before:

bin/gpClient.sh

NoopApp should print console output upon every reconfiguration: its restore method is first called with a null argument to wipe out the state corresponding to the current epoch, and then again immediately afterwards to initialize it with the state corresponding to the next epoch for the service name being reconfigured.

Step 4: Inspect the default reconfiguration policy in

DemandProfile.java [doc]

and the abstract class

AbstractDemandProfile.java [doc]

that any application-specific reconfiguration policy is expected to extend in order to achieve its reconfiguration goals.

Change the default reconfiguration policy in DemandProfile so that the default service name NoopApp0 is reconfigured less often. For example, you can set MIN_REQUESTS_BEFORE_RECONFIGURATION and/or MIN_RECONFIGURATION_INTERVAL to higher values. There are two ways to do this: (i) the quick-and-dirty way is to change DemandProfile.java directly and recompile gigapaxos from source; (ii) the cleaner and recommended way is to write your own policy implementation, say MyDemandProfile, that extends either DemandProfile or AbstractDemandProfile directly, and specify it in gigapaxos.properties by uncommenting the DEMAND_PROFILE_TYPE line and replacing the value with the canonical class name of your demand profile implementation, as shown below. With the latter approach, you just need the gigapaxos binaries and don't have to recompile gigapaxos from source; you do, however, need to compile and generate the class file(s) for your policy implementation.

#DEMAND_PROFILE_TYPE=edu.umass.cs.reconfiguration.reconfigurationutils.DemandProfile
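To make the threshold idea concrete, here is a self-contained sketch of a request-counting policy. It only mimics the shape of a demand profile; the class, method names, and default values below are illustrative, not the real AbstractDemandProfile API, whose actual signatures you should take from its documentation:

```java
public class ThresholdPolicy {
    // Illustrative reconfiguration policy: recommend reconfiguration only
    // after a minimum number of requests AND a minimum interval since the
    // last reconfiguration. (Hypothetical stand-in, not the gigapaxos API;
    // the two constants echo the property names mentioned above.)
    static final int MIN_REQUESTS_BEFORE_RECONFIGURATION = 100;
    static final long MIN_RECONFIGURATION_INTERVAL_MS = 10_000;

    private int numRequests = 0;
    private long lastReconfiguredAt = 0;

    // Called once per incoming application request.
    void register() { numRequests++; }

    // True only when both thresholds have been crossed.
    boolean shouldReconfigure(long nowMillis) {
        return numRequests >= MIN_REQUESTS_BEFORE_RECONFIGURATION
                && nowMillis - lastReconfiguredAt >= MIN_RECONFIGURATION_INTERVAL_MS;
    }

    // Reset the counters after a reconfiguration actually happens.
    void justReconfigured(long nowMillis) {
        numRequests = 0;
        lastReconfiguredAt = nowMillis;
    }

    public static void main(String[] args) {
        ThresholdPolicy p = new ThresholdPolicy();
        for (int i = 0; i < 99; i++) p.register();
        System.out.println(p.shouldReconfigure(20_000)); // false: too few requests
        p.register();
        System.out.println(p.shouldReconfigure(20_000)); // true: both thresholds met
    }
}
```

Compared to the default reconfigure-on-every-request policy, raising either threshold trades reconfiguration agility for lower overhead.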

If all goes well, with the above changes, you should see NoopApp reconfiguring itself less frequently as per the specification in your reconfiguration policy!

Troubleshooting tips: If you run into errors:

(1) Make sure the canonical class name of your policy class is correctly specified in gigapaxos.properties and the class exists in your classpath. If the simple policy change above works as expected when directly modifying the default DemandProfile implementation and recompiling gigapaxos from source, but with your own demand profile implementation you get ClassNotFoundException or other runtime errors, the most likely reason is that the JVM cannot find your policy class.

(2) Make sure that all three constructors of DemandProfile, which respectively take a DemandProfile, a String, and a JSONObject, are overridden with corresponding default implementations that simply invoke super(arg); all three constructors are necessary for gigapaxos' reflection-based demand profile instance creation to work correctly.
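In outline, the constructor requirement in (2) looks as follows. The base class here is a simplified stand-in so the sketch stays self-contained (the real third constructor takes a JSONObject; a Map plays that role below):

```java
import java.util.HashMap;
import java.util.Map;

public class ConstructorPattern {
    // Simplified stand-in for DemandProfile's three constructors.
    static class BaseProfile {
        public BaseProfile(BaseProfile other) {}        // copy constructor
        public BaseProfile(String name) {}              // named constructor
        public BaseProfile(Map<String, Object> json) {} // stand-in for the JSONObject one
    }

    // A custom policy must override all three, each just delegating to super.
    static class MyProfile extends BaseProfile {
        public MyProfile(MyProfile other) { super(other); }
        public MyProfile(String name) { super(name); }
        public MyProfile(Map<String, Object> json) { super(json); }
    }

    public static void main(String[] args) throws Exception {
        // gigapaxos instantiates the policy reflectively, roughly like this,
        // which is why all three public constructors must exist:
        Object p = MyProfile.class
                .getConstructor(String.class)
                .newInstance("NoopApp0");
        System.out.println(p.getClass().getSimpleName()); // prints MyProfile
        System.out.println(MyProfile.class.getConstructors().length); // prints 3
    }
}
```

If any of the three is missing, the reflective getConstructor lookup fails at runtime even though the class compiles fine, which matches the symptom described above.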

Step 5: Inspect the code in

NoopAppClient.java [doc]

to see how it is creating a service name by sending a CREATE_SERVICE_NAME request. A service name corresponds to an RSM, but note that there is no API to specify the set of active replicas that should manage the RSM for the name being created. This is because gigapaxos randomly chooses the initial replica group for each service at creation time. Applications are expected to reconfigure the replica group as needed after creation by using a policy class as described above.

Once a service has been created, application requests can be sent to it also using one of the sendRequest methods as exemplified in NoopAppClient.

Deleting a service is as simple as issuing a DELETE_SERVICE_NAME request using the same sendRequest API as CREATE_SERVICE_NAME above.

Note that unlike NoopPaxosAppClient above, NoopAppClient, as well as the corresponding app NoopApp, uses a different request type called AppRequest as opposed to the default RequestPacket type. Reconfigurable gigapaxos applications can define their own request types as needed for different kinds of requests. The set of request types that an application processes is conveyed to gigapaxos via the Replicable.getRequestTypes() method that the application needs to implement.

Applications can also specify whether a request should be paxos-coordinated or served locally by an active replica. By default, all requests are executed locally unless the request is of type ReplicableRequest and its needsCoordination method is true (as is the case for AppRequest by default).
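The dispatch rule just described can be sketched as follows. ReplicableRequest is a real gigapaxos interface, but the version below is a simplified stand-in so the example is self-contained:

```java
public class CoordinationDispatch {
    // Simplified stand-in for gigapaxos' ReplicableRequest interface.
    interface SimpleRequest {
        boolean needsCoordination();
    }

    // Mimics an AppRequest-style request carrying a coordination flag.
    static class AppStyleRequest implements SimpleRequest {
        final boolean coordinate;
        AppStyleRequest(boolean coordinate) { this.coordinate = coordinate; }
        public boolean needsCoordination() { return coordinate; }
    }

    // An active replica's decision, per the rule described above: requests are
    // executed locally unless they are coordination-aware AND ask for it.
    static String dispatch(Object request) {
        if (request instanceof SimpleRequest
                && ((SimpleRequest) request).needsCoordination()) {
            return "paxos-coordinated";
        }
        return "executed-locally";
    }

    public static void main(String[] args) {
        System.out.println(dispatch(new AppStyleRequest(true)));  // prints paxos-coordinated
        System.out.println(dispatch(new AppStyleRequest(false))); // prints executed-locally
        System.out.println(dispatch("plain request"));            // prints executed-locally
    }
}
```

Requests that skip coordination get lower latency at the cost of possibly reading stale state, which is why AppRequest opts in to coordination by default.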

Verify that you can create, send application requests to, and delete a new service using the methods above.

A list of all relevant classes for Tutorial 2 mentioned above is listed below for convenience:

NoopAppClient.java [doc]

NoopApp.java [doc]

AppRequest.java [doc]

ReplicableRequest.java [doc]

ClientRequest.java [doc]

ReconfigurableAppClientAsync.java [doc]
