vgough / encfs

EncFS: an Encrypted Filesystem for FUSE.

Home Page: https://vgough.github.io/encfs/

Data loss with Dropbox

internationils opened this issue

Hello,
I have been having issues using encfs with dropbox for a while, and finally dug down to try and understand when they surface. https://superuser.com/questions/949066/input-output-errors-using-encfs-folder-inside-dropbox-folder sums it up pretty well.
Basically, having a dropbox folder with encfs inside on 2 different hosts (Debian/Ubuntu in my case) can cause files to become unreadable for one host, the other host, or both (data loss). There are partial remedies, and the fix is apparently connected to running without a path-based IV (as I understand it).
The remedy (see the scripts in the next post) is scanning for input/output errors on one host, and then on the other host moving the files out of and back into the encfs (and vice versa).
The fix is (according to the SU reply):

If you are running encfs in the "maximum security" mode or you have enabled "filename to IV header chaining" it will break on any Dropbox-like service. Don't enable it. Actually, don't ever use it, it's just plain stupid to rely upon the file path for the file data encryption IV.
I would use "stream" filename encoding and only "per-file initialization vectors" and "File holes passed through to ciphertext" features to make encfs reliable.

I would suggest either a) figuring out what is going on and fixing it (i.e. run 2 hosts which create and modify files in the same dropbox/encfs FS, and hourly scan for errors), or at least b) add this to the FAQ somewhere.

#!/bin/sh
# find-corruption.sh
# Lists the files below the current directory that cannot be read
# (input/output error) and writes their names to a per-host report file.
Date=$(/bin/date -Iminutes)
# current path with "/" replaced by "_", so it can be used in a file name
Path=$(/bin/pwd | /usr/bin/awk '{gsub("/","_",$0)}1')
Host=$(hostname)
Tmpfile="./dropbox-corrupt-$Host.txt"
Destfile="./corrupt-$Host$Path-$Date.txt"
# file(1) reports an "Input/output error" for files it cannot read
/usr/bin/find . -exec /usr/bin/file '{}' \; | /bin/grep "output error" > "$Tmpfile"
# keep only the file name (everything before the first ":")
/usr/bin/awk -F ":" '{print $1}' "$Tmpfile" > "$Destfile"
/usr/bin/wc "$Destfile"
if [ ! -s "$Destfile" ] ; then
   echo "Removing zero length $Destfile"
   rm "$Destfile"
fi

#!/bin/bash
# fix-corruption.sh
# TODO: cannot interrupt the script, as the file remains in /tmp/crap ... 

echo "$0 called with $# arguments"
if [ $# -ne 1 ]; then
    echo "illegal number of parameters"
    exit 1
fi

filelist=$1
if [ ! -r "$filelist" ]; then
   echo "ERROR: $filelist is not readable!"
   exit 1
fi
DIR=$(dirname "${filelist}")

# IFS == input field separator, should be set to empty for the duration of this command
# "IFS= read" (IFS= followed by a space) sets IFS to empty
# putting it on the same line as « read » means that it's temporary to that one command

while IFS= read -r corrupted <&3; do
    # check if file is corrupted on this machine as well;
    # file(1) usually exits 0 even on a read error, so test its output for the
    # same "output error" text that find-corruption.sh looks for
    if ! /usr/bin/file "$corrupted" 2>&1 | /bin/grep -q "output error" ; then
        echo "FIXING: $corrupted"
        # if not, fix it by moving the file out of the encfs and back in
        mv "$corrupted" /tmp/crap
        sleep 5
        mv /tmp/crap "$corrupted"
        sleep 1
    else
        # if it is corrupt here as well, skip it
        echo "$corrupted" >> "$DIR/remainingCorruptedFiles.txt"
        echo "BROKEN: $corrupted corrupted on this host as well"
    fi
done 3<"$filelist"
rm "$filelist"
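
The TODO in fix-corruption.sh (an interrupted run leaves the file parked in /tmp) could be handled with a trap. The following is only a sketch, not part of the original scripts: the function name roundtrip_file, its optional delay argument and the use of mktemp for a private holding directory are assumptions made for illustration.

```sh
#!/bin/sh
# roundtrip_file FILE [DELAY]
# Move FILE out of the encfs mount and back again, waiting DELAY seconds
# (default 5, as in fix-corruption.sh) in between. If the script is
# interrupted while the file is parked, the trap moves it back first.
# Note: plain POSIX sh has no local variables, so these names are global.
roundtrip_file() {
    target=$1
    delay=${2:-5}
    # private holding directory instead of a fixed /tmp/crap
    holddir=$(mktemp -d) || return 1
    parked="$holddir/parked"
    trap 'if [ -e "$parked" ]; then mv "$parked" "$target"; fi; rmdir "$holddir" 2>/dev/null; exit 130' INT TERM
    mv "$target" "$parked" &&
        sleep "$delay" &&
        mv "$parked" "$target"
    status=$?
    # restore default signal handling and clean up
    trap - INT TERM
    rmdir "$holddir" 2>/dev/null
    return "$status"
}
```

Inside the loop of fix-corruption.sh, the mv / sleep / mv sequence would then become a single call such as roundtrip_file "$corrupted".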

Do both computers mount with --nocache ?

No, neither does.

They should if both access the EncFS files at the same time:
This makes sure that modifications to the backing files that occur outside EncFS show up immediately in the EncFS mount.

Well, the dropbox on both is running at the same time, but the access to the unencrypted files is from one OR the other only (one is a desktop, one is a laptop for travel). I'll set it on both though, thanks.

If you mount only one at a time, rather than both at the same time, this option will not help.

They are both mounted most of the time so I can work at home or on the road on the same files without having to remember to sync, just not actively accessed by me at the same time (obviously).

Let us know then if --nocache helps 👍
But as the cache timeout generally seems to be about 1 second, I'm not sure it will do the trick...

I'm not about to risk losing more files, sorry ;) I've set --nocache, and also followed the suggestion from the SU post (see my initial post here) about IV chaining and stream filename encoding - looking OK so far.

Really not sure, however, why IV chaining would cause such an issue, as the chain does not change (apart from the cache effect, which is why I proposed the --nocache option).

Here's my setup, just to document that too.

### using ENCFS with DROPBOX: nonstandard configuration is needed!!!
$ /usr/bin/encfs --nocache -oallow_other /mnt/dropboxes/dropbox/Dropbox/enc /mnt/dropboxes/encmount
Linux blackbox 4.10.0-32-generic #36-Ubuntu SMP Tue Aug 8 12:10:06 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
mounting /mnt/dropboxes/dropbox/Dropbox/enc /mnt/dropboxes/encmount
Creating new encrypted volume.
Please choose from one of the following options:
 enter "x" for expert configuration mode,
 enter "p" for pre-configured paranoia mode,
 anything else, or an empty line will select standard mode.
?> x

Manual configuration mode selected.
The following cipher algorithms are available:
1. AES : 16 byte block cipher
 -- Supports key lengths of 128 to 256 bits
 -- Supports block sizes of 64 to 4096 bytes
2. Blowfish : 8 byte block cipher
 -- Supports key lengths of 128 to 256 bits
 -- Supports block sizes of 64 to 4096 bytes

Enter the number corresponding to your choice: 1

Selected algorithm "AES"

Please select a key size in bits.  The cipher you have chosen
supports sizes from 128 to 256 bits in increments of 64 bits.
For example: 
128, 192, 256
Selected key size: 256

Using key size of 256 bits

Select a block size in bytes.  The cipher you have chosen
supports sizes from 64 to 4096 bytes in increments of 16.
Or just hit enter for the default (1024 bytes)

filesystem block size: 

Using filesystem block size of 1024 bytes

The following filename encoding algorithms are available:
1. Block : Block encoding, hides file name size somewhat
2. Block32 : Block encoding with base32 output for case-insensitive systems
3. Null : No encryption of filenames
4. Stream : Stream encoding, keeps filenames as short as possible

Enter the number corresponding to your choice: 4

Selected algorithm "Stream""

Enable filename initialization vector chaining?
This makes filename encoding dependent on the complete path, 
rather then encoding each path element individually.
[y]/n: n

Enable per-file initialization vectors?
This adds about 8 bytes per file to the storage requirements.
It should not affect performance except possibly with applications
which rely on block-aligned file io for performance.
[y]/n: y

External chained IV disabled, as both 'IV chaining'
and 'unique IV' features are required for this option.
Enable block authentication code headers
on every block in a file?  This adds about 12 bytes per block
to the storage requirements for a file, and significantly affects
performance but it also means [almost] any modifications or errors
within a block will be caught and will cause a read error.
y/[n]: n

Add random bytes to each block header?
This adds a performance penalty, but ensures that blocks
have different authentication codes.  Note that you can
have the same benefits by enabling per-file initialization
vectors, which does not come with as great of performance
penalty. 
Select a number of bytes, from 0 (no random bytes) to 8: 

Enable file-hole pass-through?
This avoids writing encrypted blocks when file holes are created.
[y]/n: y

Configuration finished.  The filesystem to be created has
the following properties:
Filesystem cipher: "ssl/aes", version 3:0:2
Filename encoding: "nameio/stream", version 2:1:2
Key Size: 256 bits
Block Size: 1024 bytes
Each file contains 8 byte header with unique IV data.
File holes passed through to ciphertext.

Now you will need to enter a password for your filesystem.
You will need to remember this password, as there is absolutely
no recovery mechanism.  However, the password can be changed
later using encfsctl.

New Encfs Password: 
Verify Encfs Password: 

$ encfsctl info enc
Version 6 configuration; created by EncFS 1.9.1 (revision 20100713)
Filesystem cipher: "ssl/aes", version 3:0:0 (using 3:0:2)
Filename encoding: "nameio/stream", version 2:1:0 (using 2:1:2)
Key Size: 256 bits
Using PBKDF2, with 190307 iterations
Salt Size: 160 bits
Block Size: 1024 bytes
Each file contains 8 byte header with unique IV data.
File holes passed through to ciphertext.

So this is the new configuration which (seems to) work.
What was the old configuration? Paranoia mode, presumably?

Probably, as it seems like a sensible default initially. I don't remember, as I set it up a while ago.

Another occurrence of a Dropbox issue: #384

It worked flawlessly for (weeks? months? years?) and suddenly began to fail?
Using the same EncFS version?

No, it has always had this problem. I just used my laptop only for a long time, so I never saw the issue. I've had to recover files before, but nothing critical, so I just avoided / ignored the issue until this time.

I just figured out that Dropbox also has its own local cache.
Not sure how Dropbox uses it in the background, but sounds like it uses it for efficiency and emergency.
This could certainly make EncFS behave badly.

My guess is that Dropbox's rename detection behaves pathologically with EncFS's paranoia mode.

With paranoia mode, the encrypted content of a file depends on the file name and path. That means that EncFS has to re-encrypt the whole file (or, the whole directory contents, recursively) when you rename or move it. The inode number stays the same, and EncFS resets the modification time to the original value. It's not unreasonable for Dropbox to think that the encrypted file has been just renamed (new name for same content).

So when Dropbox syncs the "rename" to the other PC, you will have the old content with the new file name, and EncFS will try to decrypt that, and the result is garbage.
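
To make the mechanism concrete, here is a toy model in shell. It is emphatically not the EncFS algorithm: it is a deliberately weak additive stream cipher whose keystream is seeded from the file's path, and all function names (path_seed, toy_crypt, to_bytes) are made up for this illustration. It only shows that content encrypted under one path no longer decrypts under another.

```sh
#!/bin/sh
# Toy model only, NOT the EncFS algorithm and not secure.

# path_seed PATH -> decimal seed derived from the path (CRC from cksum)
path_seed() { printf '%s' "$1" | cksum | cut -d' ' -f1; }

# to_bytes TEXT -> one decimal byte value per line
to_bytes() { printf '%s' "$1" | od -An -v -tu1 | tr -s ' ' '\n' | grep -v '^$'; }

# toy_crypt MODE PATH
# Reads decimal byte values (one per line) on stdin and writes the
# transformed values. MODE is "enc" or "dec". The keystream comes from a
# small linear congruential generator seeded with the path.
toy_crypt() {
    awk -v mode="$1" -v seed="$(path_seed "$2")" '
        BEGIN { state = seed % 2147483646 + 1 }
        {
            state = (state * 48271) % 2147483647   # next keystream state
            k = state % 256                        # keystream byte
            if (mode == "enc") v = ($1 + k) % 256
            else               v = ($1 - k + 256) % 256
            print v
        }'
}

# Usage sketch (the paths are arbitrary examples):
#   cipher=$(to_bytes "hello" | toy_crypt enc "docs/old.txt")
#   printf '%s\n' "$cipher" | toy_crypt dec "docs/old.txt"   # original byte values
#   printf '%s\n' "$cipher" | toy_crypt dec "docs/new.txt"   # different values (garbage)
```

In this model, a sync tool that carries the old ciphertext over to the new name corresponds to the last line: the bytes are intact, but the path-derived keystream no longer matches them.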

@rfjakob so that means in standard mode the encrypted contents depend only on the contents (and not the path / filename), so renaming should change the name and not the content then, letting Dropbox propagate the rename as it wants (with contents really staying the same)? I appreciate the FAQ addition, but don't quite understand how / if this would / should solve it. Thanks...

Yes, exactly, that's the idea. What you are using, configured through expert mode, is fine as well, as long as you have "external iv chaining" disabled.
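
For anyone wanting to check an existing volume, a rough sketch follows. It assumes that the version 6 configuration file (.encfs6.xml in the encrypted directory) records the option as an <externalIVChaining> element with the value 1 when enabled; that element name is written from memory and should be verified against your own config file. The function names are hypothetical.

```sh
#!/bin/sh
# Assumption: .encfs6.xml contains <externalIVChaining>1</externalIVChaining>
# when external IV chaining is enabled. Verify this against your own file.

# uses_external_iv_chaining CONFIG -> exit status 0 if the option is enabled
uses_external_iv_chaining() {
    grep -q '<externalIVChaining>1</externalIVChaining>' "$1"
}

# check_encfs_config CONFIG -> prints a one-line verdict
# returns 0 (not enabled), 1 (enabled) or 2 (config not readable)
check_encfs_config() {
    if [ ! -r "$1" ]; then
        echo "UNKNOWN: cannot read $1"
        return 2
    fi
    if uses_external_iv_chaining "$1"; then
        echo "RISKY: external IV chaining is enabled in $1"
        return 1
    fi
    echo "OK: external IV chaining is not enabled in $1"
}
```

With the setup described earlier in this thread, the call would be check_encfs_config /mnt/dropboxes/dropbox/Dropbox/enc/.encfs6.xml.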

So should --nocache (see the first few comments) be a recommendation for Dropbox use as well then (if so, please add to the FAQ), is it helpful, or is it irrelevant?

No, since in the end the issue here seems to be the Dropbox cache, not the FUSE cache.

Just to get back to this: I have been running Dropbox with encfs without "external iv chaining" across two machines since August (with heavy work on the same tree on both machines), and have seen zero corruption (with daily checks on both machines using the scripts I posted earlier in this thread). So this does indeed seem to be the fix.

Since someone asked by mail, I have had no data loss since this fix as of now (June 2019).