append-only mode is confusing

Question

esledov opened this issue 7 years ago

Here is the situation.

I created a repository in append-only mode (I checked this in the config file) and made my first backup from a remote server. Then I deleted that backup from the client, and now "borg list" shows nothing. My first thought was that the "append only" flag does not work at all.

After reading this page I see that the archive is apparently only marked as deleted, and that I am now supposed to manually delete some files from the repository after looking at a list of transactions that does not even show operation names.

Wouldn't it be much easier if the delete command were rejected outright when the repository is in "append only" mode?

(I am using borg version 1.1.4 installed via pip on Debian 9)

Alex JOST commented 7 years ago

#2251

TW · Answer 1 · Tue Jan 02 2018 21:14:58 GMT+0800 (China Standard Time)

The point of append-only mode is that borg operates normally from a user perspective.
So if you delete archives or run prune to thin out old archives, everything looks normal.

Optionally, you could manually check the state of your repos. If everything looks fine you could manually switch off append-only temporarily and do some write operation to your repo, so it will run compact_segments and realize all the queued deletions, freeing up all the space.

Only in the case of emergency (== your server got hacked, hacker got access to borg repo, used borg to "delete" your archives) you would use the transaction log to undo that.
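
To make the behaviour described above concrete, here is a minimal Python sketch of a log-structured store. It is a toy model and not borg's code: `ToyRepo`, its methods, and the tuple-based log are all invented for illustration. It shows why a deleted item disappears from the client's view while its data is still physically present, why space is only freed once compaction runs with append-only switched off, and how cutting the log back to an earlier point undoes a deletion.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    """Toy log-structured store (hypothetical, not borg's implementation)."""

    append_only: bool = True
    log: list = field(default_factory=list)  # entries: (op, key, value)

    def put(self, key, value):
        self.log.append(("PUT", key, value))

    def delete(self, key):
        # A delete is just one more appended entry; nothing is overwritten.
        self.log.append(("DEL", key, None))

    def commit(self):
        self.log.append(("COMMIT", None, None))
        if not self.append_only:
            self.compact()  # only now are queued deletions realized

    def view(self):
        """What a client sees: the result of replaying the whole log."""
        state = {}
        for op, key, value in self.log:
            if op == "PUT":
                state[key] = value
            elif op == "DEL":
                state.pop(key, None)
        return state

    def compact(self):
        """Rewrite the log so it holds only live data (this frees space)."""
        live = self.view()
        self.log = [("PUT", k, v) for k, v in live.items()]
        self.log.append(("COMMIT", None, None))

    def rollback(self, keep_entries):
        """Emergency undo: drop every log entry after the first keep_entries."""
        self.log = self.log[:keep_entries]
```

In this model, a `delete` followed by `commit` on an append-only repo leaves `view()` empty while the original `PUT` entry is still in `log`, so `rollback` can bring the data back. The same sequence with `append_only=False` compacts the log and the data is gone for good.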

TW · Answer 2 · Tue Jan 02 2018 21:22:06 GMT+0800 (China Standard Time)

The repository side ("borg serve") usually does not receive high-level commands (like "delete archive X"), but rather low-level "PUT(chunkid, content)", "DEL(chunkid)" or "COMMIT()".

If it would reject all DEL, you could not use prune or delete and your archive list would grow to unmanageable lengths. Even intermediary checkpoint archives would show up or not work as they do now.
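
A small sketch may help show what the repository side gets to see. Everything here is a simplification made up for illustration: `RecordingServer`, `delete_archive`, the `"manifest"` key, and the plain dict of archive names to chunk ids are assumptions, and the real client additionally deals with encryption, caches, and reference counts. The point is only that an archive-level delete arrives at the server as a handful of chunk-level calls.

```python
class RecordingServer:
    """Stand-in for the repository side; it only sees chunk-level calls."""

    def __init__(self):
        self.chunks = {}
        self.calls = []

    def put(self, chunk_id, content):
        self.calls.append(("PUT", chunk_id))
        self.chunks[chunk_id] = content

    def delete(self, chunk_id):
        self.calls.append(("DEL", chunk_id))
        del self.chunks[chunk_id]

    def commit(self):
        self.calls.append(("COMMIT",))


def delete_archive(server, archives, name):
    """Client-side 'delete archive'; archives maps name -> set of chunk ids."""
    removed = archives.pop(name)
    # Chunks shared with other archives must stay (deduplication).
    still_used = set().union(*archives.values())
    for chunk_id in sorted(removed - still_used):
        server.delete(chunk_id)
    # The updated archive list is itself stored as just another chunk.
    server.put("manifest", repr(sorted(archives)).encode())
    server.commit()
```

Deleting archive `"mon"` from `{"mon": {"c1", "c2"}, "tue": {"c2", "c3"}}` produces the server-side calls `DEL c1`, `PUT manifest`, `COMMIT`. Nothing in that sequence tells the server that an archive was deleted, let alone which one.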

Egor Sledov · Answer 3 · Tue Jan 02 2018 21:33:57 GMT+0800 (China Standard Time)

> The point of append-only mode is that borg operates normally from a user perspective.
> So if you delete archives or run prune to thin out old archives, everything looks normal.

I consciously stated that I want the repository to be "append only".
The fact that stuff appears to be deleted from the repository after that does not look normal to me.

Maybe it looks normal to the attacker, but I don't think you should care much about his feelings.

> If it would reject all DEL, you could not use prune or delete and your archive list would
> grow to unmanageable lengths.

I can set append-only to 0 and do management tasks from the machine where the repository is located, or from a trusted client.

(Sorry, I have been waiting for this feature and was doing backups of backups to avoid the possibility of them being deleted.)

TW · Answer 4 · Tue Jan 02 2018 21:37:36 GMT+0800 (China Standard Time)

Well, that was before you read the docs about append-only mode.
So you had some assumptions about how it works that were just not true.

"append-only" refers to the low level structure of the repository, as you can see in the docs section describing append-only mode.

Egor Sledov · Answer 5 · Tue Jan 02 2018 21:41:46 GMT+0800 (China Standard Time)

If "append only" has a special meaning for you and is not meant to work this way,
can you make a switch to allow only "borg create" from a remote location?

TW · Answer 6 · Tue Jan 02 2018 21:50:14 GMT+0800 (China Standard Time)

As I said: the repository ("borg serve" side) usually receives low-level chunk-level commands from the borg client. So it usually does not know about archive-level commands.

TW · Answer 7 · Tue Jan 02 2018 21:56:30 GMT+0800 (China Standard Time)

Hmm, guess a link from "borg init" usage docs to the "append-only" section would be helpful, right?

Alex JOST · Answer 8 · Tue Jan 02 2018 21:58:11 GMT+0800 (China Standard Time)

I have to agree that this is not ideal. If I understand the documentation correctly, operations in append-only mode such as prune or delete are simply delayed until someone runs such a command without append-only mode. If for example the client (who must not delete data in this case) runs a delete command, how am I supposed to know that? The transaction log is not helpful at all in this case.

TW · Answer 9 · Tue Jan 02 2018 22:08:13 GMT+0800 (China Standard Time)

The intention of append-only is to solve the "a bad guy hacked my box and used the borg client to delete all my backups on the borg server" problem.

For that to work, it is assumed that you will notice that your repo is in an unwanted state before issuing any command in non-append-only mode. This can be trivial or not, depending on how much energy the attacker invested to deceive you. But IIRC we already have another ticket about this, so let's not discuss that here.

You only use the transaction log to determine what files to remove from the repo after determining the time of the last good / first harmful repo accesses. If that is not easily possible, you could also go back in time step-by-step.
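
The first half of that procedure, finding the last good transaction, can be sketched in a few lines of Python. This is a hedged illustration only: it assumes log lines of the form `transaction <id>, UTC time <ISO timestamp>`, and `last_good_transaction` is a hypothetical helper, not part of borg. It deliberately stops short of touching any repository files; which files to remove afterwards is described in the append-only section of the docs.

```python
from datetime import datetime


def last_good_transaction(log_text, first_bad_access):
    """Return the highest transaction id logged before first_bad_access.

    Returns None if every logged transaction is at or after that time.
    Assumes lines like 'transaction 3, UTC time 2018-01-02T05:23:54.247742'.
    """
    last_good = None
    for line in log_text.splitlines():
        head, sep, stamp = line.strip().partition(", UTC time ")
        if not sep:
            continue  # skip blank or unrecognized lines
        tid = int(head.split()[1])
        when = datetime.fromisoformat(stamp)
        if when < first_bad_access and (last_good is None or tid > last_good):
            last_good = tid
    return last_good


LOG = """\
transaction 1, UTC time 2018-01-02T05:21:33.388802
transaction 3, UTC time 2018-01-02T05:23:54.247742
transaction 5, UTC time 2018-01-02T05:24:50.085596
transaction 7, UTC time 2018-01-02T05:33:33.836055
"""

# Suppose the first harmful access is known to have happened at 05:24:00 UTC.
print(last_good_transaction(LOG, datetime(2018, 1, 2, 5, 24, 0)))
```

The cutoff time has to come from somewhere other than the client, for example from ssh or system logs on the repository host, since the log itself only records ids and times.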

Egor Sledov · Answer 10 · Tue Jan 02 2018 22:26:57 GMT+0800 (China Standard Time)

> This can be trivial or not, depending on how much energy the attacker invested to deceive you.

In my opinion, an attacker is not supposed to be able to deceive me at all.

In the meantime, would it be possible to add to the transaction log the command that started each transaction?

Something like

transaction 1, create, UTC time 2018-01-02T05:21:33.388802
transaction 3, create, UTC time 2018-01-02T05:23:54.247742
transaction 5, delete, UTC time 2018-01-02T05:24:50.085596
transaction 7, create, UTC time 2018-01-02T05:33:33.836055

So that I can at least identify the transaction that caused something destructive?

TW · Answer 11 · Tue Jan 02 2018 23:54:06 GMT+0800 (China Standard Time)

@esledov there is no RPC API for that. And even if there were one, the client is controlled by the attacker, so he could use the API to say "create" at the beginning and then issue arbitrary DEL commands. Or say "prune" at the same time you would normally prune, just killing more.

Ronny Pfannschmidt · Answer 12 · Wed Jan 03 2018 15:18:44 GMT+0800 (China Standard Time)

append-only is perhaps named a little unluckily - something like persist-all-transactions or persistent-transactions might better convey the meaning

enkore · Answer 13 · Thu Jan 04 2018 01:18:42 GMT+0800 (China Standard Time)

There really isn't much positive to say about append only, especially since the better replacement that was hoped for never materialized and you pretty much have to read half the internal docs to understand what it is doing.

Stanislas · Answer 14 · Sat Jan 27 2018 05:49:29 GMT+0800 (China Standard Time)

> For that to work, it is assumed that you will notice that your repo is in an unwanted state before issuing any command in non-append-only mode.

That's the flaw here IMO. How am I supposed to check that on a dozen servers everyday?

At first I thought that --append-only just disabled the delete and prune commands, but it's way more complicated than that, sadly.

TW · Answer 15 · Fri Jul 13 2018 06:18:18 GMT+0800 (China Standard Time)

https://github.com/borgbackup/borg/pull/3970/files adds some docs about compaction there.

TW · Answer 16 · Mon Mar 11 2019 03:36:30 GMT+0800 (China Standard Time)

fixed by #4384.

eike-fokken · Answer 17 · Sun May 22 2022 19:29:09 GMT+0800 (China Standard Time)

> If it would reject all DEL, you could not use prune or delete and your archive list would grow to unmanageable lengths. Even intermediary checkpoint archives would show up or not work as they do now.

Sorry to dig up such an old thread, but this looks promising to me.
My question is therefore:
Is the only drawback of forbidding all DEL commands that no prune or delete can be issued from the client?
Because in that case, for experienced users and with good documentation, this could be a useful feature, e.g. in the following setting:

Say the user has a server she trusts, for whatever reason. If borg serve could deny DEL commands, the user could set up an ssh forced command that only ever allows the clients to add archives.
To keep the number of archives from growing, the user could set up a cronjob or the like on the trusted server (or even on a different trusted client, if the server is also untrusted) to prune archives.

Therefore, if preventing DEL commands has no repercussions other than preventing client-initiated prunes, this would be a possible route. Do you agree?
If so, but no one with knowledge of the borg code wants to work on this,
where in the borg code would I go to try this out?
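
As a rough illustration of the setup proposed in this comment, the following Python sketch models two connections to one chunk store, one of which is refused DEL. It is purely hypothetical: `Connection` and its `allow_delete` flag are invented names, and nothing in this thread says that borg serve offers such a switch. It also ignores the concern raised earlier in the thread that a normal backup run may itself need DEL, for example for intermediary checkpoint archives.

```python
class Connection:
    """Hypothetical per-connection policy in front of a shared chunk store."""

    def __init__(self, chunks, allow_delete):
        self.chunks = chunks              # dict shared by all connections
        self.allow_delete = allow_delete  # decided server-side, e.g. per ssh key

    def put(self, chunk_id, content):
        self.chunks[chunk_id] = content

    def delete(self, chunk_id):
        if not self.allow_delete:
            raise PermissionError("DEL rejected for this connection")
        self.chunks.pop(chunk_id, None)


store = {}
backup_client = Connection(store, allow_delete=False)  # untrusted machine
admin_client = Connection(store, allow_delete=True)    # trusted prune job

backup_client.put("c1", b"data")
try:
    backup_client.delete("c1")
except PermissionError as exc:
    print("client:", exc)

admin_client.delete("c1")  # pruning still works from the trusted side
```

The design choice being sketched is that the permission is attached to the connection by the server, so a compromised backup client cannot grant it to itself.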

Jonas Olson · Answer 18 · Sun May 22 2022 21:11:39 GMT+0800 (China Standard Time)

On 2022-05-22 13:29, eike-fokken wrote: If borg serve could deny DEL commands, the user could set up an ssh forced command, that only ever allows the clients to add archives. To keep the number of archives from growing, the user may set up a cronjob or the like on the trusted server (or even a different trusted client, if the server is also untrusted) to prune archives.

This is exactly what I too would love to be able to do. The explanation I have received for why this behaviour is not available, as I have understood it, is as follows: The Borg repository is just a key-value store, and the server is not allowed to know about any further structure in the data being stored. Specifically, the server is not allowed to know about the individual archives (backup runs) or to see which chunks (even if encrypted) belong to which archive. Therefore, the server cannot delete an individual archive. Personally, I am totally fine with the server knowing when I create or delete each backup. Much of that should be apparent from the traffic anyway, so I don't consider it something that can even be kept secret from the server. Sincerely, Jonas Olson

eike-fokken · Answer 19 · Mon May 23 2022 02:27:48 GMT+0800 (China Standard Time)

I'm very dissatisfied with the GitHub UI: I now see that you cannot tell which post I quoted from. It was this one:
#3504 (comment)

There @ThomasWaldmann seems to imply that having the server reject DEL commands would only make this client incapable of ever issuing prune and delete commands. That is a price I (and presumably many others) would be willing to pay.
It just means that I need another, fully trusted client to run the prune commands. For a server that backs up many clients this seems like a win, as then only hacking the single trusted admin client could kill the backups.

eike-fokken · Answer 20 · Mon May 23 2022 02:49:01 GMT+0800 (China Standard Time)

Also, I just realized that my comment is made obsolete by these two issues:

#1772
#2251

Sorry to have cost you time.