rchain / rchain

Blockchain (smart contract) platform using CBC-Casper proof of stake + Rholang for concurrent execution.

Strategy, categorization and handling exceptions

tgrospic opened this issue · comments

Overview

It's important to recognize what kinds of errors can happen during node execution. Some errors are the result of network problems, in which case the node should be able to recover and continue its job, but others are fatal and recovery is not possible.

We need to design a unified way to categorize different types of errors and handle them appropriately. It's also important to correctly chain errors thrown from lower layers so that the final exception contains all relevant information. Unfortunately, in the current code even this basic exception handling is not done right.

Design

Basic knowledge about exceptions

  1. Swallowing exceptions is strictly forbidden
  2. Exceptions should NOT be used to control the flow of the program
  3. Exceptions in Cats Effect work differently from Java exceptions
  4. Catching all Throwable (not only NonFatal) errors is dangerous (see item above)
  5. Transforming exceptions to String loses information and should only be done at the exit point, not inside domain logic
  6. Exceptions should be caught at the latest possible point, not on every function call
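
To illustrate items 1, 4 and 5, here is a minimal sketch in plain Scala (standard library only, no Cats Effect). The names `BlockStoreError`, `readFromDisk`, `loadBlock` and `causeChain` are hypothetical and not taken from the node's code. The sketch wraps a low-level failure in a domain exception that keeps the original error as its cause, matches only `NonFatal` errors, and turns the error into text only at the exit point.

```scala
import scala.util.control.NonFatal

object ExceptionChaining {
  // Hypothetical domain exception; it keeps the lower-level error as its cause.
  final class BlockStoreError(message: String, cause: Throwable)
      extends RuntimeException(message, cause)

  // Hypothetical low-level operation that always fails in this sketch.
  def readFromDisk(hash: String): Array[Byte] =
    throw new java.io.IOException(s"cannot read file for block $hash")

  // Wrap the low-level failure instead of swallowing it or converting it to a String.
  def loadBlock(hash: String): Array[Byte] =
    try readFromDisk(hash)
    catch {
      // NonFatal does not match errors such as OutOfMemoryError or
      // InterruptedException, so those keep propagating untouched.
      case NonFatal(e) => throw new BlockStoreError(s"failed to load block $hash", e)
    }

  // Collects the messages of the whole cause chain, outermost first.
  def causeChain(e: Throwable): List[String] =
    Iterator.iterate(e)(_.getCause).takeWhile(_ != null).map(_.getMessage).toList

  def main(args: Array[String]): Unit =
    // The exit point is the only place where the error becomes text.
    try { loadBlock("abc123"); () }
    catch { case NonFatal(e) => causeChain(e).foreach(println) }
}
```

Because the cause is preserved, the exit point can still report both the domain-level message and the underlying I/O failure.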

API

Exceptions should be transformed into another representation only when the error is handled and the code will continue normal operation, or when the error is handed to an external source or caller.

This is important for API implementations, which are a boundary where errors must be transformed depending on the underlying protocol. We are all familiar with HTTP error codes and their meaning. gRPC basically works the same way, so defining a custom ServiceError type in every response is obviously wrong.

message EventInfoResponse {
  oneof message {
    ServiceError error = 1;
    BlockEventInfo result = 2;
  }
}
message ExploratoryDeployResponse {
  oneof message {
    ServiceError error = 1;
    DataWithBlockInfo result = 2;
  }
}
// doDeploy
message DeployResponse {
  oneof message {
    ServiceError error = 1;
    string result = 2;
  }
}
// deployStatus
message DeployStatusResponse {
  oneof message {
    ServiceError error = 1;
    DeployExecStatus deployExecStatus = 2;
  }
}
message DeployExecStatus {
  oneof status {
    ProcessedWithSuccess processedWithSuccess = 1;
    ProcessedWithError processedWithError = 2;
    NotProcessed notProcessed = 3;
  }
}
message ProcessedWithSuccess {
  repeated Par deployResult = 1;
  LightBlockInfo block = 2 [(scalapb.field).no_box = true];
}
message ProcessedWithError {
  string deployError = 1;
  LightBlockInfo block = 2 [(scalapb.field).no_box = true];
}
message NotProcessed {
  string status = 1;
}
// getBlock
message BlockResponse {
  oneof message {
    ServiceError error = 1;
    BlockInfo blockInfo = 2;
  }
}
// visualizeDag
message VisualizeBlocksResponse {
  oneof message {
    ServiceError error = 1;
    string content = 2;
  }
}
// machineVerifiableDag
message MachineVerifyResponse {
  oneof message {
    ServiceError error = 1;
    string content = 2;
  }
}
// getBlocks
message BlockInfoResponse {
  oneof message {
    ServiceError error = 1;
    LightBlockInfo blockInfo = 2;
  }
}
// listenForDataAtName
message ListeningNameDataResponse {
  oneof message {
    ServiceError error = 1;
    ListeningNameDataPayload payload = 2;
  }
}
message ListeningNameDataPayload {
  repeated DataWithBlockInfo blockInfo = 1;
  int32 length = 2;
}
// listenForDataAtPar
message RhoDataResponse {
  oneof message {
    ServiceError error = 1;
    RhoDataPayload payload = 2;
  }
}
message RhoDataPayload {
  repeated Par par = 1;
  LightBlockInfo block = 2 [(scalapb.field).no_box = true];
}
// listenForContinuationAtName
message ContinuationAtNameResponse {
  oneof message {
    ServiceError error = 1;
    ContinuationAtNamePayload payload = 2;
  }
}
message ContinuationAtNamePayload {
  repeated ContinuationsWithBlockInfo blockResults = 1;
  int32 length = 2;
}
// findDeploy
message FindDeployResponse {
  oneof message {
    ServiceError error = 1;
    LightBlockInfo blockInfo = 2;
  }
}
message PrivateNamePreviewPayload {
  repeated bytes ids = 1; // a la GPrivate
}
// lastFinalizedBlock
message LastFinalizedBlockResponse {
  oneof message {
    ServiceError error = 1;
    BlockInfo blockInfo = 2;
  }
}
// isFinalized
message IsFinalizedResponse {
  oneof message {
    ServiceError error = 1;
    bool isFinalized = 2;
  }
}
message BondStatusResponse {
  oneof message {
    ServiceError error = 1;
    bool isBonded = 2;
  }
}
message StatusResponse {
  oneof message {
    ServiceError error = 1;
    Status status = 2;
  }
}

More information about gRPC error handling.
https://www.grpc.io/docs/guides/error/
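
As a rough sketch of the alternative, errors could be translated to protocol-level status codes in a single place at the API boundary. The numeric values below are the canonical gRPC status codes. Everything else (`ApiError`, `BlockNotFound`, `InvalidDeploy`, `NodeNotReady`, `toStatus`) is hypothetical and only illustrates the idea. The sketch is self-contained and does not use the gRPC library itself.

```scala
object GrpcErrorMapping {
  // Subset of the canonical gRPC status codes with their numeric values.
  sealed abstract class StatusCode(val value: Int)
  case object InvalidArgument extends StatusCode(3)
  case object NotFound        extends StatusCode(5)
  case object Internal        extends StatusCode(13)
  case object Unavailable     extends StatusCode(14)

  // Hypothetical domain errors; these are not the node's actual error types.
  sealed abstract class ApiError(message: String) extends Exception(message)
  final case class BlockNotFound(hash: String)   extends ApiError(s"block $hash not found")
  final case class InvalidDeploy(reason: String) extends ApiError(s"invalid deploy: $reason")
  final case class NodeNotReady(reason: String)  extends ApiError(s"node not ready: $reason")

  // Single translation point: domain error in, protocol status and description out.
  def toStatus(error: Throwable): (StatusCode, String) = error match {
    case e: BlockNotFound => (NotFound, e.getMessage)
    case e: InvalidDeploy => (InvalidArgument, e.getMessage)
    case e: NodeNotReady  => (Unavailable, e.getMessage)
    // Unexpected errors map to INTERNAL without leaking internal details.
    case _                => (Internal, "internal error")
  }
}
```

With this shape, response messages would only need to carry the result, and the domain logic keeps working with typed errors until the boundary is reached.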

Categorization of errors

Exceptions, as the name suggests, represent exceptional situations. When one happens, it should provide enough information for a human (or program) to understand the nature of the error, or to search for more data based on the error message.

For these purposes, errors usually contain an error code which is used to group similar types of errors or to directly identify the error.
For example, HTTP error 404 means that the requested resource was not found.
https://www.rfc-editor.org/rfc/rfc9110.html#name-404-not-found
Or HTTP 500, which represents any kind of server error.
https://www.rfc-editor.org/rfc/rfc9110.html#name-500-internal-server-error

Error codes also give users an easier way to search for errors, and they can even be used to generate a link to documentation based on the provided error code.
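
One possible shape for such codes is sketched below. The categories, the code format (`NET-0001`) and the documentation URL are all assumptions made for illustration, and `example.com` is a placeholder. The issue does not prescribe any of them.

```scala
object ErrorCodes {
  // Hypothetical categories used to group similar errors.
  sealed abstract class Category(val prefix: String)
  case object Network extends Category("NET")
  case object Storage extends Category("STO")

  final case class ErrorCode(category: Category, number: Int) {
    // Stable identifier that users can search for, e.g. "NET-0001".
    def id: String = f"${category.prefix}-$number%04d"
    // Assumed documentation URL scheme; example.com is a placeholder.
    def docUrl: String = s"https://example.com/docs/errors/$id"
  }

  // Exception that carries its code and still chains the underlying cause.
  final class NodeError(val code: ErrorCode, message: String, cause: Throwable = null)
      extends RuntimeException(s"[${code.id}] $message", cause)
}
```

For example, `ErrorCodes.ErrorCode(ErrorCodes.Network, 1).id` evaluates to `NET-0001`, and a `NodeError` built with that code has a message starting with `[NET-0001]`.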