
Data loss with Dropbox #388

Closed
internationils opened this issue Aug 19, 2017 · 25 comments

Comments

@internationils

internationils commented Aug 19, 2017

Hello,
I have been having issues using encfs with dropbox for a while, and finally dug down to try and understand when they surface. https://superuser.com/questions/949066/input-output-errors-using-encfs-folder-inside-dropbox-folder sums it up pretty well.
Basically, having a Dropbox folder with an EncFS volume inside on two different hosts (Debian/Ubuntu in my case) can cause files to become unreadable on one host, the other host, or both (data loss). There are partial remedies, and the fix is apparently connected to running without a path-based IV (as I understand it).
The remedy (see the scripts in the next posts) is scanning for input/output errors on one host and then, on the other host, moving the affected files out of and back into the EncFS mount (and vice versa).
The fix is (according to the SU reply):

If you are running encfs in the "maximum security" mode or you have enabled "filename to IV header chaining", it will break on any Dropbox-like service. Don't enable it. Actually, don't ever use it, it's just plain stupid to rely upon the file path for the file data encryption IV.
I would use "stream" filename encoding and only "per-file initialization vectors" and "File holes passed through to ciphertext" features to make encfs reliable.

I would suggest either a) figuring out what is going on and fixing it (i.e. run two hosts which create and modify files in the same Dropbox/EncFS filesystem, and scan hourly for errors), or at least b) adding this to the FAQ somewhere.

@internationils
Author

#!/bin/sh
# find-corruption.sh
# Run from inside the decrypted EncFS mount: writes a per-host, per-path
# list of files that give input/output errors on this host.
Date=$(/bin/date -Iminutes)
Path=$(/bin/pwd | /usr/bin/awk '{gsub("/","_",$0)}1')
Host=$(hostname)
Tmpfile="./dropbox-corrupt-$Host.txt"
Destfile="./corrupt-$Host$Path-$Date.txt"
# file(1) reports "(Input/output error)" for files EncFS cannot decrypt
/usr/bin/find . -exec /usr/bin/file '{}' \; | /bin/grep "output error" > "$Tmpfile"
# keep only the file names (everything before the first colon)
/usr/bin/awk -F ":" '{print $1}' "$Tmpfile" > "$Destfile"
/usr/bin/wc "$Destfile"
if [ ! -s "$Destfile" ] ; then
   echo "Removing zero length $Destfile"
   rm "$Destfile"
fi
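
A minimal usage sketch for the script above (the script location is hypothetical; the mount point is the one from the setup posted later in this thread): run it from inside the decrypted EncFS mount, where it leaves the per-host list of unreadable files behind.

# hypothetical invocation; adjust paths to your own setup
cd /mnt/dropboxes/encmount
sh ~/bin/find-corruption.sh
# result: ./corrupt-<hostname><escaped-path>-<date>.txt, one unreadable file per line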

@internationils
Author

#!/bin/bash
# fix-corruption.sh
# Run on the other host (where the files should still be readable), passing
# it the list produced by find-corruption.sh.
# TODO: cannot interrupt the script, as the current file then remains in /tmp/crap ...

echo "$0 called with $# arguments"
if [ $# -ne 1 ]; then
    echo "illegal number of parameters"
    exit 1
fi

filelist=$1
if [ ! -r "$filelist" ]; then
   echo "ERROR: $filelist is not readable!"
   exit 1
fi
DIR=$(dirname "${filelist}")

# IFS == input field separator; set it to empty for the duration of the read
# so that whitespace in file names is preserved.
# Putting "IFS=" on the same line as "read" makes it temporary to that one command.

while IFS= read -r corrupted <&3; do
    # check whether the file is corrupted on this machine as well
    if /usr/bin/file "$corrupted" >/dev/null 2>&1 ; then
        echo "FIXING: $corrupted"
        # readable here: move it out of the EncFS tree and back in,
        # so that Dropbox re-uploads a good copy
        mv "$corrupted" /tmp/crap
        sleep 5
        mv /tmp/crap "$corrupted"
        sleep 1
    else
        # corrupted on this host as well: skip it and record it
        echo "$corrupted" >> "$DIR/remainingCorruptedFiles.txt"
        echo "BROKEN: $corrupted corrupted on this host as well"
    fi
done 3<"$filelist"
rm "$filelist"
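
And a sketch of how the two scripts are meant to fit together (host names and script locations are hypothetical; the mount point is the one from the setup posted later in this thread):

# on host A, where the files show input/output errors:
cd /mnt/dropboxes/encmount && sh ~/bin/find-corruption.sh
# let Dropbox sync the resulting corrupt-*.txt list, then on host B,
# where the files are still readable:
cd /mnt/dropboxes/encmount && bash ~/bin/fix-corruption.sh ./corrupt-hostA*.txt
# (assuming exactly one list file matches the glob)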

@benrubson
Contributor

Do both computers mount with --nocache?

@internationils
Author

No, neither does.

@benrubson
Contributor

They should if both access the EncFS files at the same time:
This makes sure that modifications to the backing files that occur outside EncFS show up immediately in the EncFS mount.

@internationils
Author

Well, Dropbox is running on both at the same time, but the unencrypted files are only ever accessed from one OR the other (one is a desktop, one is a laptop for travel). I'll set it on both though, thanks.
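
For reference, a minimal sketch of mounting with --nocache (the paths are the ones from the setup posted further down; adjust to your own, and use the same option on both hosts):

# desktop and laptop alike:
encfs --nocache /mnt/dropboxes/dropbox/Dropbox/enc /mnt/dropboxes/encmount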

@benrubson
Contributor

If you only mount one at a time, rather than both at the same time, this option will not help.

@internationils
Author

They are both mounted most of the time so that I can work at home or on the road on the same files without having to remember to sync; they are just not actively accessed by me at the same time (obviously).

@benrubson
Contributor

Then let us know if --nocache helps 👍
But as the cache generally only covers about 1 second, I'm not sure it will do the trick...

@internationils
Author

internationils commented Aug 22, 2017

I'm not about to risk losing more files, sorry ;) I've set --nocache, and also followed the suggestion from the SU post (see my initial post here) about IV chaining and stream filename encoding - looking OK so far.

@benrubson
Contributor

I'm really not sure, however, why IV chaining would cause such an issue, as the chain does not change (apart from the cache effect, which is why I proposed the --nocache option).

@internationils
Author

Here's my setup, just to document that too.

### using ENCFS with DROPBOX: nonstandard configuration is needed!!!
$ /usr/bin/encfs --nocache -oallow_other /mnt/dropboxes/dropbox/Dropbox/enc /mnt/dropboxes/encmount
Linux blackbox 4.10.0-32-generic #36-Ubuntu SMP Tue Aug 8 12:10:06 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
mounting /mnt/dropboxes/dropbox/Dropbox/enc /mnt/dropboxes/encmount
Creating new encrypted volume.
Please choose from one of the following options:
 enter "x" for expert configuration mode,
 enter "p" for pre-configured paranoia mode,
 anything else, or an empty line will select standard mode.
?> x

Manual configuration mode selected.
The following cipher algorithms are available:
1. AES : 16 byte block cipher
 -- Supports key lengths of 128 to 256 bits
 -- Supports block sizes of 64 to 4096 bytes
2. Blowfish : 8 byte block cipher
 -- Supports key lengths of 128 to 256 bits
 -- Supports block sizes of 64 to 4096 bytes

Enter the number corresponding to your choice: 1

Selected algorithm "AES"

Please select a key size in bits.  The cipher you have chosen
supports sizes from 128 to 256 bits in increments of 64 bits.
For example: 
128, 192, 256
Selected key size: 256

Using key size of 256 bits

Select a block size in bytes.  The cipher you have chosen
supports sizes from 64 to 4096 bytes in increments of 16.
Or just hit enter for the default (1024 bytes)

filesystem block size: 

Using filesystem block size of 1024 bytes

The following filename encoding algorithms are available:
1. Block : Block encoding, hides file name size somewhat
2. Block32 : Block encoding with base32 output for case-insensitive systems
3. Null : No encryption of filenames
4. Stream : Stream encoding, keeps filenames as short as possible

Enter the number corresponding to your choice: 4

Selected algorithm "Stream""

Enable filename initialization vector chaining?
This makes filename encoding dependent on the complete path, 
rather than encoding each path element individually.
[y]/n: n

Enable per-file initialization vectors?
This adds about 8 bytes per file to the storage requirements.
It should not affect performance except possibly with applications
which rely on block-aligned file io for performance.
[y]/n: y

External chained IV disabled, as both 'IV chaining'
and 'unique IV' features are required for this option.
Enable block authentication code headers
on every block in a file?  This adds about 12 bytes per block
to the storage requirements for a file, and significantly affects
performance but it also means [almost] any modifications or errors
within a block will be caught and will cause a read error.
y/[n]: n

Add random bytes to each block header?
This adds a performance penalty, but ensures that blocks
have different authentication codes.  Note that you can
have the same benefits by enabling per-file initialization
vectors, which does not come with as great of performance
penalty. 
Select a number of bytes, from 0 (no random bytes) to 8: 

Enable file-hole pass-through?
This avoids writing encrypted blocks when file holes are created.
[y]/n: y

Configuration finished.  The filesystem to be created has
the following properties:
Filesystem cipher: "ssl/aes", version 3:0:2
Filename encoding: "nameio/stream", version 2:1:2
Key Size: 256 bits
Block Size: 1024 bytes
Each file contains 8 byte header with unique IV data.
File holes passed through to ciphertext.

Now you will need to enter a password for your filesystem.
You will need to remember this password, as there is absolutely
no recovery mechanism.  However, the password can be changed
later using encfsctl.

New Encfs Password: 
Verify Encfs Password: 

$ encfsctl info enc
Version 6 configuration; created by EncFS 1.9.1 (revision 20100713)
Filesystem cipher: "ssl/aes", version 3:0:0 (using 3:0:2)
Filename encoding: "nameio/stream", version 2:1:0 (using 2:1:2)
Key Size: 256 bits
Using PBKDF2, with 190307 iterations
Salt Size: 160 bits
Block Size: 1024 bytes
Each file contains 8 byte header with unique IV data.
File holes passed through to ciphertext.

@benrubson
Contributor

So this is the new configuration which seems to work.
What was the old configuration? Probably paranoia mode?

@internationils
Author

Probably, as it seems like a sensible default initially. I don't remember, as I set it up a while ago.

@benrubson
Contributor

benrubson commented Aug 22, 2017

Another occurrence of a Dropbox issue: #384

@benrubson
Contributor

It worked flawlessly for some time (weeks? months? years?) and suddenly began to fail?
Using the same EncFS version?

@internationils
Author

No, it has always had this problem. For a long time I only used my laptop, so I never saw the issue. I've had to recover files before, but nothing critical, so I just avoided / ignored the issue until this time.

@benrubson
Contributor

I just figured out that Dropbox also has its own local cache.
I'm not sure how Dropbox uses it in the background, but it sounds like it uses it for efficiency and for recovery.
This could certainly make EncFS behave badly.

@rfjakob
Collaborator

rfjakob commented Aug 22, 2017

My guess is that Dropbox's rename detection behaves pathologically with EncFS's paranoia mode.

With paranoia mode, the encrypted content of a file depends on the file name and path. That means that EncFS has to re-encrypt the whole file (or the whole directory contents, recursively) when you rename or move it. The inode number stays the same, and EncFS resets the modification time to the original value. It's not unreasonable for Dropbox to think that the encrypted file has just been renamed (new name for the same content).

So when Dropbox syncs the "rename" to the other PC, you will have the old content with the new file name, and EncFS will try to decrypt that, and the result is garbage.
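
One way to check whether an existing volume has these options enabled (a sketch; it assumes the volume root contains the usual .encfs6.xml config file and that your EncFS version serializes the options under these element names):

# the path is the encrypted volume root from the setup above
grep -E 'chainedNameIV|externalIVChaining' /mnt/dropboxes/dropbox/Dropbox/enc/.encfs6.xml
# expected for a Dropbox-friendly volume:
#   <chainedNameIV>0</chainedNameIV>            filename encoding does not depend on the path
#   <externalIVChaining>0</externalIVChaining>  file contents do not depend on the path
# a value of 1 for externalIVChaining means renames force re-encryption of file data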

rfjakob added a commit to rfjakob/encfs-next that referenced this issue Aug 22, 2017
@internationils
Author

@rfjakob so that means that in standard mode the encrypted contents depend only on the file contents (and not on the path / filename), so renaming should change the name and not the content, letting Dropbox propagate the rename as it wants (with the contents really staying the same)? I appreciate the FAQ addition, but I don't quite understand how / if this would solve it. Thanks...

@rfjakob
Collaborator

rfjakob commented Aug 22, 2017

Yes, exactly, that's the idea. What you are using, configured through expert mode, is fine as well, as long as you have "external iv chaining" disabled.

@internationils
Author

So should --nocache (see the first few comments) also be a recommendation for Dropbox use (if so, please add it to the FAQ)? Is it helpful, or is it irrelevant?

@benrubson
Contributor

No, as in the end the issue here seems to be the Dropbox cache, not the FUSE cache.

rfjakob added a commit to rfjakob/encfs-next that referenced this issue Aug 22, 2017
vgough pushed a commit that referenced this issue Aug 26, 2017
@internationils
Author

Just to get back to this: I have been running Dropbox with EncFS without "external iv chaining" across two machines since August (with heavy work on the same tree on both machines), and have seen zero corruption (with daily checks on both machines using the scripts I posted earlier). So this does indeed seem to be the fix.

@internationils
Author

Since someone asked by mail, I have had no data loss since this fix as of now (June 2019).
