barryWhiteHat / roll_up

scale ethereum with snarks

Priority queue with plasma exit on failure

barryWhiteHat opened this issue · comments

  1. Assume centralized operator model.
  2. We have 'priority request queue' on chain. Operator is obliged to serve it.
  3. If the operator fails to serve a request for N blocks, then we have a provable data unavailability event that is the operator's fault.
  4. When and only when this event is triggered:
    4.1. Everyone is allowed to exit under the improved plasma rules described by @barryWhiteHat
    4.2. Anybody can object to a stale claim, not only the owner.
    4.3. The operator's security deposit is used to compensate gas and processing costs to everyone (it must be sufficient).
  5. If the failure was purely technical, operator can resume service anytime.

In this scenario there is no need for anybody to watch the chain 24/7 until an unhappy event happens. If it happens, it will become news soon, and a lot of people will run watching software for a while, allowing everyone to exit safely. Emergency expenses are kept to a minimum. The operator can in no way profit from failure.

Originally posted by @gluk64 in #15 (comment)

@gluk64 for your priority queue proposal we need to see what the maximum size of the queue is that can be exited inside the time limit.

Then we need to think about what should happen when that queue is full.

Should we extend the withdraw time?

How do we prevent the operator from DoSing the queue?

The queue is never full; items can be added at any time. The operator is only obliged to process the next N items from the queue in every new sidechain block and to prove this in the circuit; otherwise the block is rejected by the EVM. So the 'unavailability signal' occurs if there is no new sidechain block after M mainchain blocks despite a non-empty queue.
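The rule described here can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class and method names are hypothetical, the in-circuit proof is replaced by a direct comparison of the served items against the head of the queue, and the timeout is counted from the moment the queue last became non-empty or a block was last accepted.

```python
from collections import deque


class PriorityQueueContract:
    """Toy model of the on-chain request queue (all names are hypothetical)."""

    def __init__(self, n_per_block, m_timeout):
        self.n_per_block = n_per_block  # N: items to serve per sidechain block
        self.m_timeout = m_timeout      # M: mainchain blocks tolerated without progress
        self.queue = deque()
        self.waiting_since = None       # mainchain height at which the timer started

    def add_request(self, request, mainchain_height):
        # Anyone may append at any time; the queue is never "full".
        if not self.queue:
            self.waiting_since = mainchain_height
        self.queue.append(request)

    def submit_sidechain_block(self, served, mainchain_height):
        # Stand-in for the in-circuit proof: the block must serve exactly
        # the next min(N, len(queue)) items, in queue order.
        required = min(self.n_per_block, len(self.queue))
        expected = list(self.queue)[:required]
        if list(served) != expected:
            return False  # block rejected
        for _ in range(required):
            self.queue.popleft()
        self.waiting_since = mainchain_height
        return True

    def is_unavailable(self, mainchain_height):
        # Signal fires only if requests are pending and no block arrived for M heights.
        if not self.queue:
            return False
        return mainchain_height - self.waiting_since >= self.m_timeout
```

With N = 2 and M = 10, a block that serves the two oldest requests resets the timer, while a non-empty queue left untouched for 10 mainchain blocks raises the signal.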

In that case, what is to stop the operator from filling this list with their own withdraw transactions and preventing legitimate users from exiting?

I think we would have to require the withdrawer to burn a small portion of the money they withdraw, as well as burning some of the operator's stake each time someone withdraws; otherwise it's a DoS target. Anyone can fill this queue and prevent others from taking part.

BTW, do you mean the queue holds snark transactions or withdraw transactions?

You're right. So we've ended with the fees race again :/

Ah! But what if we only allow exit requests in this queue, and they get sorted by amount? The attacker can continue flooding it only for a while, until all their funds are out.

But what if we only allow exit requests in this queue, and they get sorted by amount?

Then an attacker can fill the queue with large amounts. But we could do something with a fee that is burned for each entry in the queue.

But in the case where data becomes unavailable (say, through some kind of fault), the operator would still want to DoS the queue to prevent getting their whole deposit burned. So we should also burn an equal portion of their stake. We don't need to burn it; we could just set it aside to pay gas fees in the future and not let the operator withdraw it.

Basically, every priority queue entry should hurt the operator and the user equally. It's not a DoS vulnerability, it's 2 of 2 scorched earth.

it's 2 of 2 scorched earth

This is an unlimited griefing attack surface against operator. Griefing 1) must always have a bearable limit and 2) risk of griefing should ideally only be tolerated from a party with accumulable reputation (like operator), otherwise we're vulnerable to a mighty adversary willing to burn a lot of capital in order to discourage independent sidechains. This is why I'm worried about all fee-burning approaches.

The goal for emergency exit design is to make it possible for anybody to exit the sidechain at limited cost within a reasonable finite period of time. The worst case scenario is when everybody wants to exit at once. Then, for the last one to exit, at least O(N) requests will have to be processed on the root chain, where N is the number of leaves in the sidechain (number of accounts or assets). If N is known for any block, users can always decide whether they want to continue bearing this risk.

Here's a design iteration for a priority queue which either will ensure cheap exit in O(N) time at worst, or will trigger plasma exit for everybody plus slashing operator's security deposit:

  1. Requests can be submitted for any operation on a leaf L (exit or intra-sidechain transfer)
  2. Requests must contain proof of ownership of the leaf L at some block height B
  3. Requests are sorted in the EVM by B: earliest blocks must be served first.
  4. For every request (successful or not) the operator must publish to a smart contract on the root chain the block height of the latest update of leaf L, B_latest, along with a proof bound to the current Merkle root.
  5. All subsequent requests for leaf L will only be accepted by EVM if B > B_latest.
  6. The queue failed event is triggered if the processing pace falls below W requests per week.

Thus, an attacker can not postpone any request for longer than processing O(N) other requests. Such an attack would be very costly and eventually fruitless.
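The ordering and acceptance rules of this design can be illustrated with a small Python model. It is a sketch under assumptions, not the proposed contract: `ExitQueue`, `submit`, and `serve_next` are hypothetical names, proof verification is reduced to a boolean flag, one pending request per leaf is assumed, and a request is treated as stale when its B is lower than the B_latest the operator discloses.

```python
import heapq
import itertools


class ExitQueue:
    """Toy model: requests ordered by block height B, with a B_latest map per leaf."""

    def __init__(self):
        self._heap = []                    # entries: (B, arrival order, leaf)
        self._arrival = itertools.count()  # tie-breaker: first come, first served
        self.b_latest = {}                 # leaf -> latest update height disclosed so far
        self.pending = set()               # assumption: one pending request per leaf

    def submit(self, leaf, b, proof_ok=True):
        if not proof_ok or leaf in self.pending:
            return False
        # Rule 5: once B_latest is known, only requests with B > B_latest are accepted.
        if leaf in self.b_latest and b <= self.b_latest[leaf]:
            return False
        heapq.heappush(self._heap, (b, next(self._arrival), leaf))
        self.pending.add(leaf)
        return True

    def serve_next(self, operator_b_latest):
        # Rule 3: the request with the earliest B is served first.
        b, _, leaf = heapq.heappop(self._heap)
        self.pending.discard(leaf)
        # Rule 4: the operator discloses B_latest whether or not the request succeeds.
        self.b_latest[leaf] = operator_b_latest
        stale = b < operator_b_latest
        return leaf, b, stale
```

Each stale request forces the operator to publish B_latest for that leaf, so the same leaf cannot be reused for another outdated request.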

W must be chosen in such a way as to ensure operator can post enough proofs to the root chain in one week even under heavy traffic. Censorship attacks against operator are always possible, of course, as with any plasma sidechain. Maybe worth a separate discussion.

A nuance: in order to prevent DoS through shuffling the queue, sorting must be enforced only for queue items starting from position W. This will have no effect on the throughput of honest requests. Their execution can merely be delayed by one week.

The smart contract with the map of latest blocks can be reused for plasma exit as well.

Only a part of the operator's security deposit should be slashed each week, proportionally to the number of unfulfilled requests. The slashing amount can be gentle in the beginning and grow geometrically over time. The operator is thus motivated to resume service ASAP, and a short-term technical problem will not lead to huge losses.
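One possible shape for such a schedule is sketched below. The concrete numbers are assumptions chosen only for illustration (a 1% base rate and a doubling per consecutive failing week), and `weekly_slash` is a hypothetical helper; integer arithmetic is used because that is what a contract would have available.

```python
def weekly_slash(deposit, required, served, weeks_failing,
                 base_rate_bps=100, growth=2):
    """Amount to slash this week (hypothetical schedule).

    deposit        -- operator's original security deposit
    required       -- W, the number of requests that had to be served this week
    served         -- number of requests actually served
    weeks_failing  -- consecutive failing weeks, counting this one (1, 2, 3, ...)
    base_rate_bps  -- first-week rate in basis points for a total failure (assumed)
    growth         -- geometric growth factor per failing week (assumed)
    """
    shortfall = max(required - served, 0)
    if shortfall == 0 or required == 0:
        return 0
    # Proportional to the shortfall, growing geometrically with each failing week.
    amount = (deposit * base_rate_bps * growth ** (weeks_failing - 1) * shortfall
              // (10_000 * required))
    return min(amount, deposit)


# Half of the requests missed, deposit of 1,000,000 units:
print(weekly_slash(1_000_000, 100, 50, weeks_failing=1))  # 5000
print(weekly_slash(1_000_000, 100, 50, weeks_failing=3))  # 20000
```

A brief outage costs little, while a prolonged one quickly approaches the full deposit.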

If we can slash the operator, we can safely roll back an unavailable chain. Here is how:

  1. Data become unavailable. We kill the current operator.
  2. We allow deposits for a new operator, where anyone with the deposit selects a block X they want to continue the chain from.
  3. We select the operator with the highest X and biggest deposit.
  4. We start to roll the chain back allowing users who have data for the current block to withdraw.
  5. Once we get to block X we allow our new operator to start making blocks.

The reason we need slashing for each withdraw queue entry is that otherwise the old operator could just rejoin with new capital and DoS the system for another priority_que_timeout, or even restart the chain from an invalid state. I guess they would eventually have to start the chain again, but this way we punish censorship and forcing everything onto the chain.

So we can recover without everyone having to move to a new chain.

The reason we need slashing for each withdraw queue entry is that otherwise the old operator could just rejoin with new capital and DoS the system for another priority_que_timeout, or even restart the chain from an invalid state. I guess they would eventually have to start the chain again, but this way we punish censorship and forcing everything onto the chain.

Not sure I understand. Operator can not use new capital to DOS the queue, because new capital entries will always be at the tail of the queue, after all honest users who are exiting. As long as queue is working there is no need to slash operator, because transactions and exits can not be censored.

The operator can clear the queue, then add their DoS requests, and then make data unavailable.

So your most compelling argument to not slash the operator per request is

otherwise we're vulnerable to a mighty adversary willing to burn a lot of capital in order to discourage independent sidechains

So my answer to this is that the operator can degrade the side chain into an on-chain exchange at little cost. You are defending the operator from big external attackers, whereas I want to defend the users from a malicious operator. If a big attacker like that wants to attack, a better use of their capital would be to censor blocks at the Casper level, which would also be more effective and cheaper.

Also, what are your thoughts on the restartable side chain proposal?

The operator can clear the queue, then add their DoS requests, and then make data unavailable.

Yes. That's still O(N) time at worst, just like plasma exit. But it's more difficult, because filling the queue with requests is not enough. Operator must fill it with requests for leaves updated in the past.

The operator can fill the queue with leaves that are not valid in the current state and then slowly prove that they are no longer valid.

We start to roll the chain back allowing users who have data for the current block to withdraw.

Can you explain this part please? What does 'roll the chain back' mean?

The operator can fill the queue with leaves that are not valid in the current state and then slowly prove that they are no longer valid.

"Requests must contain proof of ownership of the leaf L at some block height B". One request per leaf. To prove that the request is wrong, the operator will have to disclose the block height of the latest update of leaf L (B_latest), and B > B_latest then becomes mandatory for new requests on that leaf.

Meanwhile, if I have a leaf which has been updated earlier, my request will go ahead of any new DOS request operator produces.

We start to roll the chain back allowing users who have data for the current block to withdraw.

Can you explain this part please? What does 'roll the chain back' mean?

So we have a list of states:

s1 -> s2 -> s3 -> s4 -> s5

where we start at s1 and end at s5.

At s1 data is available and at s5 it is not, but we do not know when it became unavailable. So we tell everyone we are going to roll back the state from s5 to s1, and ask:

  1. Does anyone have a leaf that they want to withdraw at state s5?
  2. Does anyone want to add state X and become the operator at s5?

If we don't get anyone for item 2, we withdraw everyone who replied to item 1, step the state back to s4, and repeat.

This way we start to go back in time and once an available state appears we will hopefully get a new operator.
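The rollback loop described above might look roughly like this in Python. It is a simplified sketch: `roll_back` and its inputs are hypothetical, responses are given up front as dictionaries rather than collected over a challenge period, and when several candidates bid at the same state the one with the biggest deposit is chosen.

```python
def roll_back(states, withdrawals_at, operator_bids_at):
    """Step back from the newest state until someone offers to operate.

    states            -- state ids ordered from oldest to newest, e.g. ["s1", ..., "s5"]
    withdrawals_at    -- dict: state -> leaves whose owners asked to withdraw at that state
    operator_bids_at  -- dict: state -> list of (candidate, deposit) offers to continue there
    Returns (restart_state, new_operator, withdrawn_leaves).
    """
    withdrawn = []
    for state in reversed(states):
        bids = operator_bids_at.get(state, [])
        if bids:
            # Someone can continue from this state: stop rolling back here.
            candidate, _deposit = max(bids, key=lambda bid: bid[1])
            return state, candidate, withdrawn
        # Nobody can continue from this state: pay out everyone who asked
        # to withdraw here, then step one state further back.
        withdrawn.extend(withdrawals_at.get(state, []))
    return None, None, withdrawn


states = ["s1", "s2", "s3", "s4", "s5"]
withdrawals = {"s5": ["leaf_a"], "s4": ["leaf_b"], "s3": ["leaf_c"]}
bids = {"s3": [("op1", 10), ("op2", 30)]}
print(roll_back(states, withdrawals, bids))
# ('s3', 'op2', ['leaf_a', 'leaf_b'])
```

In the example, data is assumed to be available to some candidate from s3 onward, so users with leaves at s5 and s4 are withdrawn and the chain resumes from s3.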

Meanwhile, if I have a leaf which has been updated earlier, my request will go ahead of any new DOS request an operator produces.

Since the operator is in charge of ordering transactions, she can always put her leaves at the head of the queue. Also, because she has made data unavailable in this case, she is able to claim membership in a state that no one else can.

Oh, I see. Very interesting! Please open a new issue for state rollback; this is a vast separate topic. I see some fundamental problems with rollback (part of it is the attacks you described above), but this approach can hugely accelerate plasma exit and reduce its cost.

Since the operator is in charge of ordering transactions, she can always put her leaves at the head of the queue

No, no, no, the priority queue is governed strictly by smart contracts on the root chain; the operator has no control over it. I have implemented a sorted list on EVM before; this is doable with close to O(1) inserts.
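The comment does not say which technique was used. One way to get near-constant-cost inserts, shown here purely as an assumption, is a sorted linked list where the caller supplies the predecessor node as a hint: the search happens off-chain and the contract only verifies that the hint is correct. The Python model below uses hypothetical names throughout.

```python
class HintedSortedList:
    """Sorted singly linked list; inserts are O(1) given a correct predecessor hint."""

    HEAD = 0  # sentinel node id that precedes every real node

    def __init__(self):
        self.next_id = {self.HEAD: None}  # node id -> id of the following node
        self.key = {}                     # node id -> sort key (e.g. block height B)
        self._fresh = 1                   # next unused node id

    def insert(self, key, prev_id):
        # The caller finds prev_id off-chain; here we only check it.
        if prev_id not in self.next_id:
            raise ValueError("unknown hint")
        following = self.next_id[prev_id]
        if prev_id != self.HEAD and self.key[prev_id] > key:
            raise ValueError("hint is too far right")
        # Equal keys must go after existing ones, keeping first-come order.
        if following is not None and self.key[following] <= key:
            raise ValueError("hint is too far left")
        node = self._fresh
        self._fresh += 1
        self.key[node] = key
        self.next_id[node] = following
        self.next_id[prev_id] = node
        return node

    def keys(self):
        out, node = [], self.next_id[self.HEAD]
        while node is not None:
            out.append(self.key[node])
            node = self.next_id[node]
        return out
```

A wrong hint is rejected rather than corrected, so the verification cost stays constant regardless of the queue length.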

Oh, I see. Very interesting! Please open a new issue for state rollback; this is a vast separate topic. I see some fundamental problems with rollback (part of it is the attacks you described above), but this approach can hugely accelerate plasma exit and reduce its cost.

#19

Operator must fill it with requests for leaves updated in the past.

Oh wait, so it's older leaves that get priority?

Oh wait, so it's older leaves that get priority?

Right

The party who can create these states en masse is the prover. They can mount a long-range attack in which they create leaves, wait a long time, and then DoS the chain at no cost, looking for a bribe.