borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.

Home Page: https://www.borgbackup.org/

append-only mode is confusing

esledov opened this issue · comments

Here is the situation.

I created a repository in append-only mode (I checked this in the config file). I made my first backup from a remote server. Then I deleted this backup again from the client, and when I issue "borg list" nothing is there. My first thought was that the "append only" flag does not work at all.

And after reading this page I see that apparently the archive is only marked as deleted, and I am now supposed to manually delete some files from the repository after looking at a list of transactions that does not even list operation names.

Wouldn't it be much easier if the delete command were outright rejected when the repository is in "append only" mode?

(I am using borg version 1.1.4 installed via pip on Debian 9)

commented

The point of append-only mode is that borg operates normally from a user perspective.
So if you delete archives or run prune to thin out old archives, everything looks normal.

Optionally, you could manually check the state of your repos. If everything looks fine you could manually switch off append-only temporarily and do some write operation to your repo, so it will run compact_segments and realize all the queued deletions, freeing up all the space.
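As a rough illustration of the "switch off append-only temporarily" step, here is a minimal Python sketch that reads and toggles the flag in the repository's config file. It assumes the config is an INI-style file with an `append_only` key in a `[repository]` section; the function names and any path you pass in are hypothetical, and this is not borg's own code. Keep in mind what the comment above says: the first write after switching the flag off makes all queued deletions permanent, so check the repository state first.

```python
import configparser


def read_append_only(config_path):
    """Return True if append_only is set to a non-zero value in the config."""
    parser = configparser.ConfigParser(interpolation=None)
    with open(config_path) as fh:
        parser.read_file(fh)
    # Assumption: the flag is stored as 0 or 1 in the [repository] section.
    return parser.getint("repository", "append_only", fallback=0) != 0


def write_append_only(config_path, enabled):
    """Rewrite the config with append_only set to 1 (enabled) or 0 (disabled)."""
    parser = configparser.ConfigParser(interpolation=None)
    with open(config_path) as fh:
        parser.read_file(fh)
    parser.set("repository", "append_only", "1" if enabled else "0")
    with open(config_path, "w") as fh:
        parser.write(fh)
```

This should only be run on the machine that holds the repository (or another trusted one), never from the client that append-only mode is meant to protect against.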

Only in the case of emergency (== your server got hacked, hacker got access to borg repo, used borg to "delete" your archives) you would use the transaction log to undo that.

commented

The repository side ("borg serve") usually does not receive highlevel commands (like "delete archive X"), but rather low-level "PUT(chunkid, content)", "DEL(chunkid)" or "COMMIT()".

If it would reject all DEL, you could not use prune or delete and your archive list would grow to unmanageable lengths. Even intermediary checkpoint archives would show up or not work as they do now.
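To make the low-level view more concrete, here is a toy Python model of an append-only log that accepts PUT, DEL and COMMIT. It is only a sketch of the idea, not borg's repository format: the class name, the in-memory list and the sequential transaction numbers are all assumptions made for illustration. It shows why a "deleted" archive disappears from the current view while its data is still recoverable from an earlier transaction.

```python
class ToyAppendOnlyRepo:
    """Toy model: every operation is appended to a log, nothing is overwritten."""

    def __init__(self):
        self.log = []      # entries of the form (op, key, value)
        self.commits = []  # log length recorded at each COMMIT

    def put(self, key, value):
        self.log.append(("PUT", key, value))

    def delete(self, key):
        # A DEL is just another log entry; the earlier PUT stays in the log.
        self.log.append(("DEL", key, None))

    def commit(self):
        self.commits.append(len(self.log))
        return len(self.commits)  # 1-based transaction number (toy numbering)

    def state(self, transaction=None):
        """Replay the log up to a transaction (default: the latest commit)."""
        if transaction is None:
            end = self.commits[-1] if self.commits else 0
        else:
            end = self.commits[transaction - 1]
        objects = {}
        for op, key, value in self.log[:end]:
            if op == "PUT":
                objects[key] = value
            else:
                objects.pop(key, None)
        return objects

    def rollback(self, transaction):
        """Discard everything logged after the given transaction."""
        end = self.commits[transaction - 1]
        del self.log[end:]
        del self.commits[transaction:]


repo = ToyAppendOnlyRepo()
repo.put("chunk-a", b"data")
t1 = repo.commit()
repo.delete("chunk-a")
t2 = repo.commit()
current_view = repo.state()   # empty: the DEL hides chunk-a
earlier_view = repo.state(t1) # still contains chunk-a
repo.rollback(t1)             # undo the deletion by dropping later entries
```

In this model, compaction would be the step that physically drops the PUT entries hidden by a later DEL, which is exactly what append-only mode postpones.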

The point of append-only mode is that borg operates normally from a user perspective.
So if you delete archives or run prune to thin out old archives, everything looks normal.

I consciously stated that I want the repository to be "append only".
The fact that stuff appears to be deleted from the repository after that does not look normal to me.

Maybe it looks normal to the attacker, but I don't think you should care much about his feelings.

If it would reject all DEL, you could not use prune or delete and your archive list would grow to unmanageable lengths.

I can set append-only to 0 and do management tasks from the machine where the repository is located, or from a trusted client.

(Sorry, I had just been waiting for this feature and was doing backups of backups to avoid the possibility of them being deleted.)

commented

Well, that was before you read the docs about append-only mode.
So you had some assumptions about how it works that were just not true.

"append-only" refers to the low level structure of the repository, as you can see in the docs section describing append-only mode.

If "append only" has a special meaning for you and is meant to work this way, can you add a switch that allows only "borg create" from a remote location?

commented

As I said: the repository ("borg serve" side) usually receives low-level chunk-level commands from the borg client. So it usually does not know about archive-level commands.

commented

Hmm, guess a link from "borg init" usage docs to the "append-only" section would be helpful, right?

I have to agree that this is not ideal. If I understand the documentation correctly, operations in append-only mode such as prune or delete are simply delayed until someone runs such a command without append-only mode. If for example the client (who must not delete data in this case) runs a delete command, how am I supposed to know that? The transaction log is not helpful at all in this case.

commented

The intention of append-only is to solve the "a bad guy hacked my box and used the borg client to delete all my backups on the borg server" problem.

For that to work, it is assumed that you will notice that your repo is in an unwanted state before issuing any command in non-append-only mode. This can be trivial or not, depending on how much energy the attacker invested to deceive you. But IIRC we already have another ticket about this, so let's not discuss that here.

You only use the transaction log to determine what files to remove from the repo after determining the time of the last good / first harmful repo accesses. If that is not easily possible, you could also go back in time step-by-step.
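The "last good / first harmful access" step can be sketched in a few lines of Python. This assumes the transaction log has one entry per line of the form `transaction N, UTC time <ISO timestamp>`; the function names are made up for this example, and the sample timestamps are the ones quoted elsewhere in this thread. The cutoff time has to come from your own security logs.

```python
import re
from datetime import datetime

# Assumed line format: "transaction 5, UTC time 2018-01-02T05:24:50.085596"
LINE_RE = re.compile(r"transaction (\d+),.*UTC time (\S+)")


def parse_transactions(log_text):
    """Return a list of (transaction_id, timestamp) tuples from the log text."""
    entries = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            entries.append(
                (int(match.group(1)), datetime.fromisoformat(match.group(2)))
            )
    return entries


def last_good_transaction(log_text, first_harmful_access):
    """Return the highest transaction ID committed before the cutoff, or None."""
    good = [
        tid
        for tid, timestamp in parse_transactions(log_text)
        if timestamp < first_harmful_access
    ]
    return max(good) if good else None


sample_log = """\
transaction 1, UTC time 2018-01-02T05:21:33.388802
transaction 3, UTC time 2018-01-02T05:23:54.247742
transaction 5, UTC time 2018-01-02T05:24:50.085596
transaction 7, UTC time 2018-01-02T05:33:33.836055
"""

# Hypothetical cutoff: the attacker is believed to have gained access at 05:24:00.
cutoff = datetime(2018, 1, 2, 5, 24, 0)
last_good = last_good_transaction(sample_log, cutoff)
```

Everything written after the returned transaction is then a candidate for removal, following the recovery procedure in the append-only docs. Working on a copy of the repository is the safer choice.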

This can be trivial or not, depending on how much energy the attacker invested to deceive you.

In my opinion, the attacker is not supposed to be able to deceive me at all.

In the meantime, would it be possible to add to the transaction log the command that started each transaction?

Something like

transaction 1, create, UTC time 2018-01-02T05:21:33.388802
transaction 3, create, UTC time 2018-01-02T05:23:54.247742
transaction 5, delete, UTC time 2018-01-02T05:24:50.085596
transaction 7, create, UTC time 2018-01-02T05:33:33.836055

So that I can at least identify the transaction that caused something destructive?

commented

@esledov there is no RPC API for that. And even if there were one, the client is controlled by the attacker, so he could use the API to say "create" at the beginning and later issue arbitrary DEL commands. Or "prune" at just the same time you normally prune, only killing more.

append-only is perhaps named a little unfortunately - perhaps something like persist-all-transactions or persistent-transactions would better convey the meaning

There really isn't much positive to say about append only, especially since the better replacement that was hoped for never materialized and you pretty much have to read half the internal docs to understand what it is doing.

For that to work, it is assumed that you will notice that your repo is in an unwanted state before issuing any command in non-append-only mode.

That's the flaw here IMO. How am I supposed to check that on a dozen servers everyday?

At first I thought that --append-only just disabled the delete and prune commands, but it's way more complicated than that, sadly.

commented

https://github.com/borgbackup/borg/pull/3970/files adds some docs about compaction there.

commented

fixed by #4384.

If it would reject all DEL, you could not use prune or delete and your archive list would grow to unmanageable lengths. Even intermediary checkpoint archives would show up or not work as they do now.

Sorry to dig up such an old thread, but this looks promising to me.
My question is therefore:
Is the only drawback of forbidding all DEL commands that no prune or delete can be issued from the client?
Because in that case, for experienced users and with good documentation, this could be a useful feature, e.g. in the following setup:

Say the user has a server she, for whatever reason, trusts. If borg serve could deny DEL commands, the user could set up an ssh forced command that only ever allows the clients to add archives.
To keep the number of archives from growing, the user may set up a cronjob or the like on the trusted server (or even on a different trusted client, if the server is also untrusted) to prune archives.
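The trusted-side pruning job could look roughly like the following Python sketch, which a cronjob might call. It only builds and runs an ordinary `borg prune` command line with retention options; the function names, the default retention counts and any repository path you pass in are assumptions for illustration. A DEL-rejecting `borg serve` mode as proposed here is hypothetical and not part of this sketch.

```python
import subprocess


def build_prune_command(repo, keep_daily=7, keep_weekly=4, keep_monthly=6):
    """Build the argument list for a borg prune run (retention counts are examples)."""
    return [
        "borg", "prune",
        "--keep-daily", str(keep_daily),
        "--keep-weekly", str(keep_weekly),
        "--keep-monthly", str(keep_monthly),
        repo,
    ]


def run_prune(repo, **retention):
    """Run borg prune on a trusted machine; raises CalledProcessError on failure."""
    return subprocess.run(build_prune_command(repo, **retention), check=True)
```

As noted earlier in the thread, any write made in non-append-only mode makes earlier deletions permanent, so such a job should only run after the repository state has been checked.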

Therefore, if preventing DEL commands has no repercussions other than preventing client-initiated prunes, this would be a possible route. Do you agree?
If so, but no one with knowledge of the borg code wants to work on this, where in the borg code would I go to try this out?

I'm very dissatisfied with the GitHub UI: I now see that you cannot tell which post I quoted from. It was this one:
#3504 (comment)

There @ThomasWaldmann seems to imply that rejecting DEL commands on the server would only make this client incapable of ever issuing prune and delete commands. That is a price I (and presumably many others) would be willing to pay.
It just means that I need another fully trusted client to run the prune commands. For a server that backs up many clients this seems like a win, as then only hacking the single trusted admin client could kill the backups.

Also, I just realized, my comment is made obsolete by these two issues:

#1772
#2251

Sorry to have cost you time.