Backup Bliss

This entry was originally published in our developer’s wiki; we’ve pasted it here to conserve mouse clicks.

 


So, internally we’ve found a need to do a lot of file transfer and encryption with off-site backups. It took most of the day, but we’ve got a cool solution that works well, and we thought we’d share it with you.

Background and problem

  • I need to accept tons of transfers from a variety of clients in a secure way
  • I want to have these files backed up in a large device on my network
  • I want to make sure that these backups are stored encrypted
  • I also need to make sure that these backups are encrypted remotely (via Amazon S3)

Our Solution

  • Secure transfer via sftp + rssh
  • Use Ubuntu’s encfs as the encryption method
  • Use Jungledisk and rsync to backup the stored files to S3

Hey, you think you’re so cool using standard tools. Why the need to write a wiki entry on this?

Not everything works out of the box…what to do?

sftp+rssh+chroot == wtf?!?

For something that purports to be secure, we found one glaring nuisance: not an outright security flaw, but something that would raise eyebrows. When you use chroot to build a jail as the sequestered / directory, you still need to grant your users access to the entire (limited) filesystem inside it so that the restricted scp and sftp can execute. That includes the /home directory. So if your user testuser logs in and does a cd .., they will be able to look around and see which other clients this server is feeding. Definitely a big no-no in our book. Some forums we ran across raised this issue, but the questions went unanswered. Since disk space is cheap, we ultimately made a new chroot jail for each customer.

We largely followed the directions found at this Ubuntu forum entry, with one modification: you’ll need to run chroot.sh <jail location> for each jail you want to set up.

Then, for each user you add, you will need to modify /etc/passwd to set the user’s home directory to the correct jail root.

In the end, we put these entries directly at the end of /etc/rssh.conf, in the per-user options block.

user=user1:011:00011:"/home/jail1"
user=user2:011:00011:"/home/jail2"
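Each per-user line packs four colon-separated fields: the username, a umask, the access bits and the chroot path. As far as the rssh.conf man page describes it (worth double-checking against your installed version, since the number of bits has changed between releases), the five access bits stand for rsync, rdist, cvs, sftp and scp in that order, so 00011 allows only sftp and scp. A minimal sketch of a helper that prints such lines for review before they are appended to /etc/rssh.conf; the function name rssh_user_line is made up for illustration and is not part of rssh:

```shell
#!/bin/sh
# Hypothetical helper (not part of rssh): print one per-user line for /etc/rssh.conf.
# Line layout: user=<name>:<umask>:<access bits>:"<chroot path>"
# Assumption: 011 and 00011 are simply the values used in the entries above.
rssh_user_line() {
    user=$1
    jail=$2
    printf 'user=%s:011:00011:"%s"\n' "$user" "$jail"
}

# Print the two entries shown above; review them, then append to /etc/rssh.conf as root.
rssh_user_line user1 /home/jail1
rssh_user_line user2 /home/jail2
```

Generating the lines rather than typing them keeps the umask and access bits identical across customers, so only the username and jail path vary.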

Deciphering Encryption

Ubuntu’s latest releases come with two different flavors of folder encryption: encfs and ecryptfs. We chose encfs because it gives a bit more control over the location and, more importantly, over the multiple ways in which one can mount the folder. According to this blog, the performance difference between the two seems negligible.

encfs allows you to store data in an encrypted folder; let’s call it /var/encrypteddata. In order to work with its contents, you need to mount it using

encfs /var/encrypteddata ~/letmesee

Now, if I look at ~/letmesee, I can see all the correct filenames and can read from and write to it. If I look at /var/encrypteddata, the filenames are gibberish, as are the contents.
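The whole mount, inspect and unmount cycle can be sketched as a small script. This is only an illustration using the two paths from above; the run wrapper and the DRY_RUN variable are assumptions added so that the script prints its commands by default instead of touching anything. fusermount -u is the usual way to unmount a FUSE filesystem such as encfs.

```shell
#!/bin/sh
# Sketch of a mount / inspect / unmount cycle for the encfs folder described above.
# DRY_RUN defaults to 1, so the commands are only printed; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

ENCRYPTED=/var/encrypteddata   # encrypted store: gibberish names and contents
VIEW="$HOME/letmesee"          # decrypted view, only populated while mounted

run mkdir -p "$VIEW"
run encfs "$ENCRYPTED" "$VIEW"   # asks for the passphrase
run ls "$VIEW"                   # readable filenames
run ls "$ENCRYPTED"              # encrypted filenames
run fusermount -u "$VIEW"        # unmount the decrypted view
```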

Another cool part is that since the directories are locked with a passphrase, I can mount the encrypted folder across the network so long as I have encfs running on my machine. Of note, I had to make sure that the mounting machine’s encfs was at least as new as the version that created the volume: I could not mount a volume encrypted on version 1.4.2 (Intrepid Ibex) with the default 1.3.2 (Hardy Heron).

Why not truecrypt?

This program intrigues us greatly, and we use it in other contexts, but for our needs, we found that it might have caused more issues, so ultimately we didn’t implement our solution with it.

The issues:

  • truecrypt needs to either allocate a big block of a file as the encrypted store
  • or it needs to mount an entire drive

These two constraints made it difficult to do a few things:

  • We wanted more flexibility: just dump data to a directory without having to allocate the storage ahead of time.
  • We also wanted to send incremental backups of the data to S3. With truecrypt, we saw that we would have to decrypt the data out of the volume, deposit it to S3 and re-encrypt it (higher CPU cost for the decrypt->recrypt, but that’s not necessarily a bad thing). encfs lets us just do a direct copy of the individual encrypted files to S3.

Implementing encfs

So, ultimately for our solution, we attached our NAS to the machine running the sftp+rssh server. What we set as the home directories of our jailed/chroot-ed users is the visible endpoint of an encfs mount. I had some trouble at first, but found that the fuse mount option for multiple users (allow_other) makes the mounted directory visible to the rssh’ed user on sftp. This page got me started on the road to using encfs.

To reiterate:

encfs -o allow_other /mnt/nasdrive/encrypted /home/user1/data-to-encrypt
rssh user1 home dir: /home/jail1/home/user1/data-to-encrypt

 

Note: the -o allow_other flag is the fuse option that allows your jailed user to access the folder mounted by root.
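One way to confirm that the option took effect is to look the mount point up in the kernel’s mount table. The sketch below assumes a Linux system where FUSE mounts list allow_other among their options in /proc/mounts; the function name has_allow_other is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if mount point $1 is listed with the allow_other option.
# $2 is the mount table to read; it defaults to /proc/mounts (Linux).
# Fields in that table: device, mount point, filesystem type, options, dump, pass.
has_allow_other() {
    awk -v dir="$1" '
        $2 == dir && $4 ~ /(^|,)allow_other(,|$)/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Usage:
#   has_allow_other /home/user1/data-to-encrypt && echo "visible to the jailed user"
```

If the check fails for a mount that exists, the folder was most likely mounted without -o allow_other and will look empty or inaccessible to anyone but the mounting user.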

Jungledisk is awesome

There, I said it. Using the Linux command-line tool, we mounted our S3 filesystem with the fstab entry:

jungledisk  /mnt/s3  fuse  noauto,config=/etc/jungledisk-settings.xml 0 0

If you get any weird errors when trying to write files to the S3 mount, make sure that the temp directory is actually viable. I copied over the XML settings verbatim from Windows and it gave me two problems: first, the Amazon secret key is not stored by default in the Windows file; second, the file has its temp directory set to some win32-specific file path.
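A quick way to test whether a candidate temp directory is viable is to try creating a file in it. This is a minimal sketch; the function name tmp_dir_is_viable is made up, and the directory to pass in is whatever the temp setting in your own settings file points at.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if directory $1 exists and a file can be created in it.
tmp_dir_is_viable() {
    dir=$1
    [ -d "$dir" ] || return 1
    # Actually create (and then remove) a probe file instead of trusting permission bits.
    probe=$(mktemp "$dir/probe.XXXXXX" 2>/dev/null) || return 1
    rm -f "$probe"
}

# Usage:
#   tmp_dir_is_viable /tmp || echo "pick a different temp directory"
```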

After that, using rsync to get the /mnt/nasdrive/encrypted directory to somewhere in S3 is a breeze. The advantage here is that we’re just uploading the already-encrypted data as is. Because it’s encrypted on a per-file basis, we can run rsync (or just cp, for that matter) to transfer only the files that are missing or changed (instead of, say, re-sending a huge block of a truecrypt volume).
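The copy step might look like the sketch below. The destination folder name is an assumption, since the exact location inside the S3 mount is not given here, and sync_encrypted is a made-up wrapper name. It is also an assumption that the S3 mount preserves modification times; if it does not, rsync’s --size-only or --checksum options may be a better way to decide which files to skip.

```shell
#!/bin/sh
# Sketch: copy the encrypted store (never the decrypted view) to the S3 mount.
# Any arguments after the two directories are passed straight to rsync, e.g. --dry-run.
sync_encrypted() {
    src=$1
    dest=$2
    shift 2
    # -a recurses and preserves file attributes; the trailing slashes copy directory contents;
    # --itemize-changes prints one line per item that rsync updates.
    rsync -a --itemize-changes "$@" "$src/" "$dest/"
}

# Usage (the destination folder is hypothetical):
#   sync_encrypted /mnt/nasdrive/encrypted /mnt/s3/encrypted-backup --dry-run
#   sync_encrypted /mnt/nasdrive/encrypted /mnt/s3/encrypted-backup
```

Running with --dry-run first shows which encrypted files would be sent without writing anything to the S3 mount.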
