dm-crypt

How to set up an encrypted USB-disk software-RAID-1 on Debian GNU/Linux using mdadm and cryptsetup

This is what I set up for backups recently, using a cheap USB enclosure which can house two SATA disks and presents them to the system as two USB mass-storage devices (over a single USB cable). Without any further introduction, here goes the HOWTO:

First, create one big partition of exactly the same size on each of the two disks (/dev/sdc and /dev/sdd in my case). The cfdisk details are omitted here.

  $ cfdisk /dev/sdc
  $ cfdisk /dev/sdd

Then, create a new RAID array using the mdadm utility:

  $ mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1

The array is named md0, consists of the two devices (--raid-devices=2) /dev/sdc1 and /dev/sdd1, and it's a RAID-1 array (--level=1), i.e. data is simply mirrored on both disks so that if one of them fails you don't lose any data. After this has been done the array will be synchronized so that both disks contain the same data (this process will take a long time). You can watch the current status via:

  $ cat /proc/mdstat
  Personalities : [raid1]
  md0 : active raid1 sdd1[1] sdc1[0]
        1465135869 blocks super 1.1 [2/2] [UU]
        [>....................]  resync =  0.0% (70016/1465135869) finish=2440.6min speed=10002K/sec
  unused devices: <none>

Some more info is also available from mdadm:

  $ mdadm --detail --scan
  ARRAY /dev/md0 metadata=1.01 name=foobar:0 UUID=1234578:1234578:1234578:1234578

  $ mdadm --detail /dev/md0
  /dev/md0:
          Version : 1.01
    Creation Time : Sat Feb  6 23:58:51 2010
       Raid Level : raid1
       Array Size : 1465135869 (1397.26 GiB 1500.30 GB)
    Used Dev Size : 1465135869 (1397.26 GiB 1500.30 GB)
     Raid Devices : 2
    Total Devices : 2
      Persistence : Superblock is persistent
      Update Time : Sun Feb  7 00:03:21 2010
            State : active, resyncing
   Active Devices : 2
  Working Devices : 2
   Failed Devices : 0
    Spare Devices : 0
   Rebuild Status : 0% complete
             Name : foobar:0  (local to host foobar)
             UUID : 1234578:1234578:1234578:1234578
           Events : 1
      Number   Major   Minor   RaidDevice State
         0       8       33        0      active sync   /dev/sdc1
         1       8       49        1      active sync   /dev/sdd1

Next, you'll want to create a big partition on the RAID device (cfdisk details omitted)...

  $ cfdisk /dev/md0

...and then encrypt all the (future) data on the device using dm-crypt+LUKS and cryptsetup:

  $ cryptsetup --verbose --verify-passphrase luksFormat /dev/md0p1
  Enter your desired passphrase here (twice)
  $ cryptsetup luksOpen /dev/md0p1 myraid

After opening the encrypted container with cryptsetup luksOpen you can create a filesystem on it (ext3 in my case; -m 0 means no blocks are reserved for root, which is fine for a pure backup/data disk):

  $ mkfs.ext3 -j -m 0 /dev/mapper/myraid

That's about it. In the future you can access the RAID data using the steps below.

Starting the RAID and mounting the drive:

  $ mdadm --assemble /dev/md0 /dev/sdc1 /dev/sdd1
  $ cryptsetup luksOpen /dev/md0p1 myraid
  $ mount -t ext3 /dev/mapper/myraid /mnt
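
If you want to double-check that everything came up properly, something like this should be enough (a quick sanity check, nothing more):

  # RAID assembled?
  $ cat /proc/mdstat
  # dm-crypt mapping active?
  $ cryptsetup status myraid
  # filesystem mounted?
  $ df -h /mnt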

Shutting down the RAID:

  $ umount /mnt
  $ cryptsetup luksClose myraid
  $ mdadm --stop /dev/md0

That's all. Performance is shitty due to all the data being shoved out over one USB cable (and USB itself being too slow for these amounts of data), but I don't care too much about that as this setup is meant for backups, not performance-critical stuff.

Update 04/2011: Thanks to Bohdan Zograf there's a Belorussian translation of this article now!

Note to self: Missing lvm2 and cryptsetup packages lead to non-working initrd very, very soon

I recently almost died from a heart attack because after a really horrible crash (don't ask), Debian unstable on my laptop wouldn't boot anymore. The system hung at "Waiting for root filesystem...", and I was in panic mode as I feared I lost all my data (and as usual my backups were waaay too old).

At first I suspected that something actually got erased or mangled due to the crash, either at the dm-crypt layer, the LVM layer, or the ext3 filesystem on top of those. After several hours of messing with live CDs, cryptsetup, LVM commands (such as pvscan, pvs, vgchange, vgs, vgck) and finally fsck I still had not managed to successfully boot my laptop.

I finally was able to boot by changing the initrd from initrd.img-2.6.30-2-686 to initrd.img-2.6.30-2-686.bak in the GRUB2 menu (at boot-time), at which point it was clear that something was wrong with my current initrd. A bit of debugging and some initrd comparisons revealed the cause:

Both the cryptsetup and lvm2 packages were no longer installed on my laptop, which made all update-initramfs invocations (e.g. upon kernel package updates) create initrds which did not contain the proper dm-crypt and LVM support. Hence, no booting for me. I only noticed because of the crash, as I usually don't reboot the laptop very often (two or three times per year, maybe).
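
In case this ever happens again, the fix boils down to reinstalling the two packages and regenerating the initrds; something like this should do (lsinitramfs ships with initramfs-tools and lets you verify the result, the kernel version is of course just my example):

  # reinstall the missing packages
  $ apt-get install cryptsetup lvm2
  # regenerate the initrds for all installed kernels
  $ update-initramfs -u -k all
  # verify that the dm-crypt/LVM bits actually ended up in the initrd
  $ lsinitramfs /boot/initrd.img-2.6.30-2-686 | grep -E 'cryptsetup|lvm'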

Now, as to why those packages were removed I have absolutely no idea. I did not remove them knowingly, so I suspect some dist-upgrade did it and I didn't notice (but I do carefully check which packages dist-upgrade tries to remove, usually)...

Resizing ext3-on-LVM-on-dmcrypt file systems, moving disk space from one LV to another

Back in 2008 I wrote a small article about resizing LVM physical volumes. I had to do something similar, but slightly more complicated, recently. My /usr logical volume (LV) was getting full on my laptop disk, so I wanted to shrink another LV and move some of that space to /usr. Here's one way you can do that.

Requirements: a Live CD containing all required utilities (cryptsetup, LVM tools, resize2fs); I used grml.

Important: If you plan to perform any of these steps, make sure you have recent backups! I take no responsibility for any data loss you might experience. You have been warned!

First, shut down the laptop and boot the Live CD. Then, open the dm-crypt device (/dev/hda3 in my case) by entering your passphrase:

  $ cryptsetup luksOpen /dev/hda3 foo

Activate all (newly available) LVM volume groups in that encrypted device:

  $ vgchange -a y

(maybe you also need a vgscan and/or lvscan, not sure)

Check how much free space we have for putting into our /usr LV:

  $ vgdisplay | grep Free
  Free  PE / Size       0 / 0   

OK, so we have none. Thus, we need to shrink another LV (/home, in my case) and put the newly freed space into the /usr LV. In order to do that, we first have to check the current size of the /home LV:

  $ mount -t ext3 /dev/vg-whole/lv-home /mnt
  $ df --block-size=1M | grep -C 1 /mnt
  $ umount /mnt

(if you know how to find out the size of an ext3 file system without mounting it, please let me know) Update: See comments for suggestions.
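
One approach that should work (an untested sketch): read the block count and block size straight from the ext3 superblock with tune2fs, no mounting required:

  # filesystem size according to its superblock, no mount needed
  $ tune2fs -l /dev/vg-whole/lv-home | grep -i 'block count\|block size'
  # size in 1M units = "Block count" * "Block size" / (1024 * 1024)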

Write down the total number of 1M blocks on the file system (116857 in my case), we'll need that later. Now run 'fsck' on the /home LVM logical volume, which is required for the 'resize2fs' step afterwards. This will take quite a while.

  $ fsck -f /dev/vg-whole/lv-home

The next step is resizing the ext3 file system in the /home LVM logical volume, making it 1 GB smaller than before (of course you must have >= 1 GB of free space on /home for that to work). We use bash arithmetic to do the math.

Note: I'm not so sure about the sizes here; in my first attempt something went wrong and resize2fs said "filesystem too small" or the like. Maybe I'm confusing the size units of 'df' and 'resize2fs', or the bash calculation goes wrong? Please leave a comment if you know more!

  $ resize2fs /dev/vg-whole/lv-home $((116857-1024))M

Then, we can safely reduce the LV itself. Note: order is very important here, you must shrink the ext3 filesystem first, and then shrink the LV! Doing it the other way around will destroy your filesystem!

  $ lvreduce -L -1G /dev/vg-whole/lv-home
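
If you're as unsure about the unit math as I am, a more forgiving variant of the same shrink sequence is to shrink the filesystem a bit more than necessary, reduce the LV, and then let resize2fs grow the filesystem back so it exactly fills the smaller LV (a sketch only, using the same 1 GB example):

  # shrink the fs by 2 GB, i.e. more than we are going to take away
  $ resize2fs /dev/vg-whole/lv-home $((116857-2048))M
  # shrink the LV by only 1 GB
  $ lvreduce -L -1G /dev/vg-whole/lv-home
  # grow the fs again so it exactly fills the (now smaller) LV
  $ resize2fs /dev/vg-whole/lv-home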

Now that we have 1 GB of free space to spend on LVs, we assign that space to the /usr LVM logical volume like this:

  $ lvextend -L +1G /dev/vg-whole/lv-usr

As usual, we then run 'fsck' on the filesystem in order to be able to use 'resize2fs' to resize it to the biggest possible size (that's the default if resize2fs gets no parameters):

  $ fsck -f /dev/vg-whole/lv-usr
  $ resize2fs /dev/vg-whole/lv-usr

That's it. You can now shut down the Live CD system and boot into the normal OS with the new space allocations:

  $ vgchange -a n
  $ cryptsetup luksClose foo
  $ halt

Recovering from a dead disk in a Linux software-RAID5 system using mdadm

As I wrote quite a while ago, I set up a RAID5 with three IDE disks at home, which I'm using as backup (yes, I know that RAID != backup) and storage space.

A few days ago, the RAID was put to a real-life test for the first time, as one of the disks died. Here's what that looks like in dmesg:

raid5: raid level 5 set md1 active with 3 out of 3 devices, algorithm 2
RAID5 conf printout:
 --- rd:3 wd:3
 disk 0, o:1, dev:hda2
 disk 1, o:1, dev:hdg2
 disk 2, o:1, dev:hde2
[...]
hdg: dma_timer_expiry: dma status == 0x21
hdg: DMA timeout error
hdg: 4 bytes in FIFO
hdg: dma timeout error: status=0x50 { DriveReady SeekComplete }
ide: failed opcode was: unknown
hdg: dma_timer_expiry: dma status == 0x21
hdg: DMA timeout error
hdg: 252 bytes in FIFO
hdg: dma timeout error: status=0x50 { DriveReady SeekComplete }
ide: failed opcode was: unknown
hdg: dma_timer_expiry: dma status == 0x21
hdg: DMA timeout error
hdg: 252 bytes in FIFO
hdg: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdg: DMA disabled
ide3: reset: success
hdg: dma_timer_expiry: dma status == 0x21
hdg: DMA timeout error
hdg: 252 bytes in FIFO
hdg: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdg: DMA disabled
ide3: reset: success
hdg: status timeout: status=0x80 { Busy }
ide: failed opcode was: 0xea
hdg: drive not ready for command
hdg: lost interrupt
hdg: task_out_intr: status=0x50 { DriveReady SeekComplete }
ide: failed opcode was: unknown
hdg: lost interrupt
hdg: task_out_intr: status=0x50 { DriveReady SeekComplete }
ide: failed opcode was: unknown

That's when I realized that something was horribly wrong.

Not long after that, these messages appeared in dmesg. As you can see, the software RAID automatically realized that a drive had died and removed the faulty disk from the array. I did not lose any data, and the system did not freeze up; I could continue working as if nothing had happened (as it should be).

 md: super_written gets error=-5, uptodate=0
 raid5: Disk failure on hdg2, disabling device.
 raid5: Operation continuing on 2 devices.
 RAID5 conf printout:
  --- rd:3 wd:2
  disk 0, o:1, dev:hda2
  disk 1, o:0, dev:hdg2
  disk 2, o:1, dev:hde2
 RAID5 conf printout:
  --- rd:3 wd:2
  disk 0, o:1, dev:hda2
  disk 2, o:1, dev:hde2

This is how you can check the current RAID status:

 $ cat /proc/mdstat
 Personalities : [raid6] [raid5] [raid4] 
 md1 : active raid5 hda2[0] hde2[2] hdg2[3](F)
       584107136 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]

The "U_U" means two of the disks are OK, and one is faulty/removed. The desired state is "UUU", which means all three disks are OK.

The next step is to replace the dead drive with a new one, but first you should know exactly which disk you need to remove (in my case: hda, hde, or hdg). If you remove the wrong one, you're screwed: the RAID will be dead and all your data will be lost (RAID5 can only survive one dead disk at a time).

The safest way (IMHO) to know which disk to remove is to write down the serial number of the disk, e.g. using smartctl, and then check the back side of each disk for the matching serial number.

 $ smartctl -i /dev/hda | grep Serial
 $ smartctl -i /dev/hde | grep Serial
 $ smartctl -i /dev/hdg | grep Serial

(ideally you should get the serial numbers before one of the disks dies)
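
Optionally, before powering down, you can also drop the failed partition from the array explicitly; since the kernel already marked it faulty (the "(F)" in /proc/mdstat above), mdadm will let you remove it:

 $ mdadm --manage /dev/md1 --remove /dev/hdg2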

Now power down the PC and remove the correct drive. Get a new drive which is at least as big as the one you removed. As this is software-RAID you have quite a lot of flexibility; the new drive doesn't have to be from the same vendor / series, it doesn't even have to be of the same type (e.g. I got a SATA disk instead of another IDE one).

Insert the drive into some other PC in order to partition it correctly (e.g. using fdisk or cfdisk). In my case I needed a 1 GB /boot partition for GRUB, and the rest of the drive is another partition of type "Linux RAID autodetect", which the software RAID will then recognize.
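
For reference, with a recent sfdisk the layout can also be scripted; this is just a sketch under my assumptions (the new disk shows up as /dev/sda, roughly 1 GB for /boot as type 83, the rest as type fd aka "Linux RAID autodetect"), cfdisk/fdisk work just as well:

 # partition 1: ~1 GB for /boot (type 83), partition 2: rest of the disk (type fd)
 $ printf '%s\n' ',1G,83' ',,fd' | sfdisk /dev/sda
 # double-check the result
 $ fdisk -l /dev/sda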

Then, put the drive into the RAID PC and power it up. After a successful boot (remember, 2 out of 3 disks in a RAID5 are sufficient for a working system) you'll have to hook the new drive up to the RAID:

 $ mdadm --manage /dev/md1 --add /dev/sda2
 mdadm: added /dev/sda2

My new SATA drive ended up as /dev/sda, so /dev/sda2 is the partition I added using mdadm. The RAID immediately starts restoring/resyncing all data onto that drive, which may take a while (2-3 hours, depending on the RAID size and some other factors). You can check the current progress with:

 $ cat /proc/mdstat 
 Personalities : [raid6] [raid5] [raid4] 
 md1 : active raid5 sda2[3] hda2[0] hde2[2]
       584107136 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]
       [>....................]  recovery =  0.1% (473692/292053568) finish=92.3min speed=52632K/sec

As soon as this process is finished you'll see this in dmesg:

 md: md1: recovery done.
 RAID5 conf printout:
  --- rd:3 wd:3
  disk 0, o:1, dev:hda2
  disk 1, o:1, dev:sda2
  disk 2, o:1, dev:hde2

In /proc/mdstat you'll see "UUU" again, which means your RAID is fully functional and redundant (with three disks) again. Yay.

 $ cat /proc/mdstat
 Personalities : [raid6] [raid5] [raid4] 
 md1 : active raid5 sda2[1] hda2[0] hde2[2]
       584107136 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

Btw, another nice utility you might find useful is hddtemp, which can check the temperature of the drives. You should take care that they don't get too hot, especially so if the RAID runs 24/7.

 $ hddtemp /dev/hda
 /dev/hda: SAMSUNG HD300LD: 38 °C
 $ hddtemp /dev/hde
 /dev/hde: SAMSUNG HD300LD: 44 °C
 $ hddtemp /dev/sda
 /dev/sda: SAMSUNG HD322HJ: 32 °C

Speed up Linux crypto operations on the One A110 laptop with VIA Padlock

OK, so I've been hacking on and testing my shiny new One A110 mini-laptop during the last few days and I must say I'm very happy with it. I'll write up some more details later (check the wiki if you're impatient), but today I want to highlight a very nice feature of this laptop (compared to, for instance, the Eee PC): The VIA C7-M ULV CPU in the laptop has VIA Padlock support.

VIA Padlock is a hardware feature in recent VIA CPUs which provides hardware-accelerated AES and SHA-1/SHA-256 support, among other things. This can be used in Linux (with the proper drivers and patches) to improve performance of dm-crypt, OpenSSL (and all programs using it), scp, sha1sum, OpenVPN, etc. etc.
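
On current kernels the relevant drivers are the padlock_aes and padlock_sha modules; once they're loaded, /proc/crypto should list the PadLock implementations (a quick sanity check, module names may differ slightly between kernel versions):

$ modprobe padlock_aes
$ modprobe padlock_sha
$ grep -i padlock /proc/crypto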

I have written a quite extensive VIA Padlock HOWTO with benchmarks in the A110 wiki (but all of this works on other systems which have VIA Padlock, too). To summarize, here are the most important benchmarks:

dm-crypt (256bit AES, cbc-essiv:sha256)

Without VIA Padlock support:

$ hdparm -tT /dev/mapper/hdc2_crypt
/dev/mapper/hdc2_crypt:
 Timing cached reads:   448 MB in  2.00 seconds = 223.47 MB/sec
 Timing buffered disk reads:   22 MB in  3.07 seconds =   7.17 MB/sec

With VIA Padlock support:

$ hdparm -tT /dev/mapper/hdc2_crypt
/dev/mapper/hdc2_crypt:
 Timing cached reads:   502 MB in  2.00 seconds = 250.41 MB/sec
 Timing buffered disk reads:   90 MB in  3.07 seconds =  29.36 MB/sec

The native speed of the SSD in the laptop is 31.01 MB/sec, so there is almost no performance penalty when using VIA Padlock.

OpenSSL

OpenSSL speed benchmark, first line without Padlock, second line with Padlock enabled:

$ openssl speed -evp aes-256-cbc [-engine padlock]
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc       9187.18k    10572.28k    11054.32k    11179.36k    11218.02k
aes-256-cbc      47955.92k   150619.73k   325730.73k   458320.11k   520520.79k

ssh/scp

Without VIA Padlock support:

$ scp -c aes256-cbc bigfile.dat localhost:/dev/null
bigfile.dat                100%  159MB   5.9MB/s   00:27

With VIA Padlock support:

$ scp -c aes256-cbc bigfile.dat localhost:/dev/null
bigfile.dat                100%  159MB  14.5MB/s   00:11

OpenVPN

A real speed benchmark is pending (it's not easily measurable on a 100 MBit LAN; I'll try on a slower link), but as OpenVPN uses OpenSSL it should see roughly the same speedup, provided you tell OpenVPN to use AES (it uses Blowfish by default).
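
Switching OpenVPN to AES is a one-line change on both ends of the tunnel; something like this in the server and client configs should do (the exact cipher name is just the obvious choice here):

# use hardware-accelerated AES instead of the default Blowfish
cipher AES-256-CBC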

Also, there's a measurable difference in CPU load while transferring large files over OpenVPN: 8% CPU load with VIA Padlock vs. 20% without.

sha1sum / phe_sum

phe_sum is a small C program which can be used as a drop-in replacement for sha1sum (which doesn't support VIA Padlock yet). A quick benchmark:

sha1sum, without VIA Padlock:

$ time sha1sum bigfile.dat
real    0m6.511s
user    0m5.864s
sys     0m0.412s

phe_sum (with VIA Padlock support):

$ time ./phe_sum bigfile.dat
real    0m1.149s
user    0m0.704s
sys     0m0.424s

All in all VIA Padlock gives you a pretty impressive speedup for many crypto-using applications on Linux, which is especially useful on the A110 mini-laptop (think OpenVPN or scp for mobile usage, and dm-crypt for an encrypted SSD, of course).
