
Nearline backup, using cheap storage and ZFS via FUSE

Posted in HowTo on 16/08/2010 by Undersys

I wanted an automated backup system for my PC. Normally I copy my stuff to a Blu-ray RW each week, which is annoying because it requires me to do something that I could automate.

I recently bought an 880 GB USB2 drive for a really cheap price. I did not want another power brick, hence the USB-powered drive, and there's no need for speed in this project.

One requirement is that the drive stores a lot of my backups for a long time. Given that the data being backed up only changes a little between runs, it got me thinking about data deduplication. Currently ZFS is the only stable, free-ish (CDDL) file system that supports data deduplication. The CDDL also currently conflicts with the GPL, so there is no kernel support for ZFS. Instead, ZFS is supported via FUSE.

Normally I avoid FUSE; the whole idea of a file system in user space annoys me, but anyway…
For the sole purpose of backups we shall let it be.

Let's get ZFS:-
# emerge -av sys-fs/zfs-fuse (you may need to unmask this)

Let's get rid of any partitions, as I will use the whole disk:-
# fdisk /dev/sdb
Hit ‘d’ to delete the partition (repeat if there is more than one), then ‘w’ to write the changes.
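Since deleting partitions is destructive, it can be worth checking that nothing on the disk is mounted first. The following is only a sketch: the function name is_mounted is made up for this example, and /dev/sdb is assumed to be the USB drive as above.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Hypothetical helper: succeeds if DEVICE or one of its numbered
# partitions appears as a mounted device in the mounts table.
is_mounted() {
    dev="$1"
    mounts="${2:-/proc/mounts}"   # each line starts with the device node
    grep -q "^${dev}[0-9]* " "$mounts"
}

# Example use before running fdisk:
#   if is_mounted /dev/sdb; then echo "unmount /dev/sdb first" >&2; fi
```

The second argument only exists so the check can be tried against a copy of the mounts table; normally you would leave it out.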

I also wanted the system to leave my USB drive alone, i.e. don’t auto-anything!
To achieve this I made use of HAL.

First let's find out what HAL calls my USB2 drive:-
# hal-find-by-property --key block.device --string /dev/sdb
For me this returned:-
/org/freedesktop/Hal/devices/volume_part_1_size_888183152128
/org/freedesktop/Hal/devices/storage_serial_Seagate_FreeAgent_Go_2GE94FPL_0_0

I want to match a string based on file system type so I selected the HAL object
“/org/freedesktop/Hal/devices/volume_part_1_size_888183152128”

Let's query the HAL object for a key we can match:-
# lshal --show /org/freedesktop/Hal/devices/volume_part_1_size_888183152128
I selected the key “volume.fstype” as I do not want HAL to touch any ZFS file systems. You could pick other keys; it's up to you and your needs.

Now let's make a local HAL policy for this:-
# nano /etc/hal/fdi/policy/10-zfs-fuse.fdi
Enter the following into the file:
<?xml version="1.0" encoding="UTF-8"?>
<deviceinfo version="0.2">
  <device>
    <match key="volume.fstype" string="zfs">
      <merge key="volume.ignore" type="bool">true</merge>
    </match>
  </device>
</deviceinfo>

Reference (1)

Restart the HAL daemon after saving the file:-
# /etc/init.d/hald restart

As we are using a USB drive, there's a chance that it will receive a different device node each time, say sdc or sdb. Let's use udev to abstract that away to something we can always use as a device path.

I created the following local udev rule, which creates a symlink from my known, fixed path to the real device node. The ‘:=’ assignment makes the symlink final, so later udev rules that match cannot change it.
# nano /etc/udev/rules.d/10-local.rules
I put the following in this file:
KERNEL=="sd*", SUBSYSTEMS=="block", ATTR{size}=="1734732719", SYMLINK:="dsk/zfs/backups"
In short, match any sd device in the block subsystem whose size matches, then symlink it to /dev/dsk/zfs/backups and make that assignment final.
I picked size as the attribute to match, but you can use any attribute of the device itself or of its parent devices if you need to be more exact in your match.
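A quick sanity check on the size value: sysfs reports the size attribute in 512-byte sectors, so multiplying by 512 should give the byte count that HAL put in the volume name earlier. The helper name sectors_to_bytes is made up for this sketch.

```shell
# Convert a sysfs "size" value (counted in 512-byte sectors) to bytes.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

sectors_to_bytes 1734732719   # prints 888183152128
```

That matches the 888183152128 in the HAL object name above, so the rule is matching the same volume that HAL reported.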

I used the following command to find attributes to match on:-
# udevadm info -a -p /sys/block/sdb/

Reference (2)

Now that the USB disk will do what I want, let's create a ZFS file system.

To create the initial pool and its top-level ZFS file system:-
# zpool create backups /dev/dsk/zfs/backups (in this example “backups” is the zpool name and “/dev/dsk/zfs/backups” is the dev node)
Enable data dedup:-
# zfs set dedup=on backups

To remove the USB drive we must first export the pool, which also unmounts the ZFS file system:-
# zpool export backups

To reattach the USB storage we need to import the pool. As we are not using a standard device node path, we need the ‘-d’ option to tell zpool which directory to search:-
# zpool import -d /dev/dsk/zfs/ backups

See the respective man pages for ‘zfs’ and ‘zpool’ for information on the above commands.
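The export and import steps can be wrapped in a couple of small shell functions. This is only a rough sketch, not a finished backup script: the function names (run, attach_pool, detach_pool) and the DRY_RUN switch are made up for this example, while the pool name and paths are the ones used above.

```shell
POOL="backups"
DEVDIR="/dev/dsk/zfs"
DEVLINK="$DEVDIR/backups"   # symlink created by the udev rule

# Run a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Import the pool, but only if the udev symlink shows the drive is plugged in.
attach_pool() {
    if [ "${DRY_RUN:-0}" != "1" ] && [ ! -e "$DEVLINK" ]; then
        echo "device $DEVLINK not found, is the drive plugged in?" >&2
        return 1
    fi
    run zpool import -d "$DEVDIR" "$POOL"
}

# Export the pool so the drive can be unplugged safely.
detach_pool() {
    run zpool export "$POOL"
}
```

Setting DRY_RUN=1 before calling attach_pool or detach_pool just prints the zpool command instead of running it, which is handy for checking the paths without touching the pool.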

I will soon write a script that will take care of my backups and the use of ZFS.

Reference:-
1)
http://webcvs.freedesktop.org/hal/hal/doc/spec/hal-spec.html?view=co#ov_hal_linux26
http://wiki.archlinux.org/index.php/HAL

2)
http://www.reactivated.net/writing_udev_rules.html
http://www.gentoo.org/doc/en/udev-guide.xml