Revision 10 as of 2013-07-16 13:36:48
Size: 8492
Editor: ?MarcusOsdoba



This article explains how to offer "Previous versions" on your Samba shares by combining the shadow_copy2 module with snapshots created by dirvish.

What is shadow_copy

With the stackable Samba VFS module shadow_copy2, Windows clients can browse earlier states of a share. Right-click a file or directory on the share, choose "Properties" and open the "Previous versions" tab to go back in time for that file or directory.

Alternatives to serve shadow copies

Motivation to use dirvish over other alternatives

  • space saving with hardlinks is comparable to LVM snapshots
  • many LVM snapshots degrade performance on LVM volumes
  • works within the same filesystem without repartitioning
  • saves one indirection through the block device layer (works on a plain filesystem, so it is somewhat portable)

  • smooth deletion of older snapshots built-in with dirvish-expire and central config file

Setup dirvish for a samba share

Preparation

apt-get install samba dirvish

Set up your Samba shares as usual (authenticating against a PDC or not, using winbind or not, etc.). A share definition may look like this:

[employee]
  comment = Share for all employees
  writable = yes
  path = /srv/samba-shares/employee
  directory mask = 0770
  create mask    = 0660
  valid users    = @MYDOMAIN\employee
  force group    = employee

Dirvish configuration

Dirvish uses two terms you need to know, but there is no magic behind them. A "bank" is a plain directory in which dirvish stores "vaults". A vault is nothing more than a directory below a bank that contains some dirvish metadata in a plain text file. Dirvish ships with a default cron job and a sample default.conf that can be used as a template to create a vault.

master.conf

The dirvish:master.conf is located in /etc/dirvish. Mine looks similar to this:

bank:
  /srv/dirvish
#default excludes
Runall:
  employee
expire-default: +3 days
# expire rules
# speed-limit in megabit per second
speed-limit: 50

In this case "employee" is the vault that dirvish will process when it creates new snapshots and deletes old ones according to the defined expiration rules.

Create a vault

In this example the source data are located in /srv/samba-shares/employee. The vault is named after the share: employee.

# mkdir -p /srv/dirvish/employee/dirvish
# cp /usr/share/doc/dirvish/examples/default.conf.root /srv/dirvish/employee/dirvish/default.conf

You can study the default template with the help of the dirvish documentation, or simply adapt it to something like the following:

client: myhostname
tree: /srv/samba-shares/employee
xdev: 1
index: gzip
log: gzip
image-default: %Y.%m.%d-%H.%M.%S
# vault specific excludes
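
The image-default pattern uses strftime-style conversions, so date(1) can preview the directory name that a snapshot taken right now would get. This is only an illustration under that assumption; dirvish expands the pattern itself, and the variable name image_name is made up for this sketch.

```shell
# Preview the image name for the pattern used in default.conf above.
# date(1) understands the same %Y.%m.%d-%H.%M.%S conversions.
image_name=$(date +%Y.%m.%d-%H.%M.%S)
echo "$image_name"
```

The result has the same shape as the directory names shown in the listings of this article, for example 2013.06.23-22.04.02.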

Let's initialize the vault (a typical error at this step is an incorrect hostname in default.conf):

# dirvish --init --vault=employee
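
Since a wrong hostname in default.conf is the typical error at this step, it may help to compare the client: value with the local hostname. The following is a minimal sketch for a local backup like the one in this article; check_client is a hypothetical helper and not part of dirvish, and it assumes that uname -n prints the same name you entered (short name versus fully qualified name can differ).

```shell
# Compare the "client:" value of a vault config with the local hostname.
# check_client is a hypothetical helper for this sketch, not a dirvish tool.
check_client() {
  conf=$1
  # Take the value of the first line whose key is exactly "client".
  client=$(awk -F': *' '$1 == "client" { print $2; exit }' "$conf")
  if [ "$client" = "$(uname -n)" ]; then
    echo "client matches local hostname: $client"
  else
    echo "client '$client' differs from local hostname '$(uname -n)'" >&2
    return 1
  fi
}

# Usage with the vault from this article:
# check_client /srv/dirvish/employee/dirvish/default.conf
```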

You're done: the initial image has been created under /srv/dirvish/employee.

# ls -l /srv/dirvish/employee/
drwxr-xr-x 3 root employee 59 2013-06-23 22:49 2013.06.23-22.04.02
drwxr-xr-x 2 root root     44 2013-06-22 12:37 dirvish

Note: The data are in fact doubled. The original source under /srv/samba-shares/employee and the initial dirvish image should now consume the same amount of space in the filesystem. All further snapshots build on this initial image with hardlinks (which don't consume extra space) and only need space for the files that actually changed over time.
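
The effect of hardlinks can be tried out in a throwaway directory. The sketch below uses cp -al from GNU coreutils as a stand-in for what dirvish does through rsync; the directory and file names are made up for the demonstration, and nothing under /srv is touched.

```shell
# Sandbox demo: a hard-linked copy shares its inodes with the original.
demo=$(mktemp -d)
mkdir "$demo/image1"
echo "unchanged content" > "$demo/image1/report.txt"

# cp -al recreates the directory tree but hard-links the files in it.
cp -al "$demo/image1" "$demo/image2"

# Same inode number in both trees, link count 2: the data exists only once.
inode1=$(stat -c %i "$demo/image1/report.txt")
inode2=$(stat -c %i "$demo/image2/report.txt")
links=$(stat -c %h "$demo/image1/report.txt")
echo "inode image1: $inode1, inode image2: $inode2, link count: $links"
```

Both paths refer to the same inode, so the second "image" costs only the space for its directory entries.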

Glue it together

Dirvish creates one directory per snapshot, named according to the defined pattern (e.g. 2013.07.12-10.34.30), and puts the valuable data below it. The copied files themselves reside in the subdirectory "tree". The shadow_copy2 module expects directories named in the format @GMT-%Y.%m.%d-%H.%M.%S. So I created symlinks with such names in a third directory, each pointing to the corresponding tree subdirectory.

# ls -l /srv/snapshots/employee/
total 0
lrwxrwxrwx 1 root root 49 24. Jun 01:04 @GMT-2013.06.23-22.04.02 -> /srv/dirvish/employee/2013.06.23-22.04.02/tree
lrwxrwxrwx 1 root root 49  1. Jul 00:06 @GMT-2013.06.30-22.05.03 -> /srv/dirvish/employee/2013.06.30-22.05.03/tree
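
A link like the ones listed above can be created by hand for a single image. This is a small sketch; link_image is a hypothetical helper name, and the paths in the usage line are the ones used in this article.

```shell
# Create the shadow_copy2 link for one dirvish image.
# link_image is a hypothetical helper used only for this sketch.
link_image() {
  image_dir=$1      # image directory, without a trailing slash
  snapshot_dir=$2   # directory that shadow:snapdir will point to
  mkdir -p "$snapshot_dir"
  # The link name is the image name with the @GMT- prefix added.
  ln -s "$image_dir/tree" "$snapshot_dir/@GMT-${image_dir##*/}"
}

# Usage:
# link_image /srv/dirvish/employee/2013.06.23-22.04.02 /srv/snapshots/employee
```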

Additions to samba share definition

Finally, we need to tell the shadow_copy2 module where to find the directories starting with "@GMT-", each of which represents an earlier state of the original source data in /srv/samba-shares/employee.

[...]
; needed for shadow_copy2
wide links      = yes
unix extensions = no
[...]
[employee]
[...]
vfs objects = shadow_copy2
shadow:snapdir = /srv/snapshots/employee
shadow:basedir = /srv/samba-shares/employee
[...]

Restart samba with  invoke-rc.d samba restart  and try to browse the samba share from Windows. Check the "Previous versions" tab.

Automate it

Dirvish automatically expires old snapshots according to the expiration rules and regularly creates new ones via the shipped cron job. This cron job processes every vault listed in the "Runall:" section of master.conf.

To let Samba see these new snapshots (by creating links in our /srv/snapshots/employee directory) and to hide expired ones (by removing their links from the snapshot directory), I have written a simple script. I added it to the default.conf of the vault using the post-server directive:

post-server: /root/dirvish-shadowcopy2.sh employee

#!/bin/sh
#
# may be used in post-server: directive of default.conf
#

DIRVISH_BANK=/srv/dirvish
VAULT=${1:-myvaultname}

SNAPSHOT_BASEDIR=/srv/snapshots

# create new links: one per image directory (names start with a digit,
# which skips the vault's own "dirvish" config directory)
find "$DIRVISH_BANK/$VAULT" -maxdepth 1 -mindepth 1 -type d -regex ".*/[0-9.-]+.*" |
while read -r file; do
  LINKNAME="@GMT-${file##*/}"
  if [ ! -h "$SNAPSHOT_BASEDIR/$VAULT/$LINKNAME" ]; then
    echo "create link $LINKNAME in dir $SNAPSHOT_BASEDIR/$VAULT to $file/tree"
    ln -s "$file/tree" "$SNAPSHOT_BASEDIR/$VAULT/$LINKNAME"
  fi
done

# delete broken links (original removed by dirvish-expire)
find "$SNAPSHOT_BASEDIR/$VAULT/" | while read -r line; do
  # a symlink whose target no longer exists
  if [ -L "$line" ] && [ ! -e "$line" ]; then
    echo "remove obsolete link: $line"
    rm -- "$line"
  fi
done

#EOF
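
To check that no stale links are left in a snapshot directory, the dangling symlinks can be listed directly. The sketch below relies on the -xtype l test of GNU find; list_dangling_links is a hypothetical helper name.

```shell
# List dangling symlinks directly below a snapshot directory (GNU find).
# -xtype l matches symlinks whose target no longer exists.
list_dangling_links() {
  find "$1" -mindepth 1 -maxdepth 1 -xtype l
}

# Usage:
# list_dangling_links /srv/snapshots/employee
```

An empty result means every link still points to an existing tree directory.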

Summary of configuration files

The setup described here requires only minor changes to the files originally shipped by Debian.

  • dirvish:master.conf: your bank(s), your vaults (optional: your expiration rules, maybe other dirvish directives)
  • dirvish:default.conf: one config file per vault
  • samba:smb.conf: add two global lines plus three lines in the share definition to activate shadow_copy2 as shown above
  • dirvish:/etc/cron.d/dirvish: not changed at all

Optional Tweaks and Pitfalls

  • Put your files on an XFS filesystem and use project quotas to control space consumption
  • The snapshot directories need to be readable by the accessing users (otherwise the list in the "Previous versions" tab stays empty). Keep this in mind when you change the permissions for your share
  • In an earlier trial, I let dirvish itself create directories starting with "@GMT-". The cron-job script then stopped working (I guess an argument needs to be quoted somewhere). I therefore dropped the "@GMT-" prefix from the image names and put it on the symlinks to the tree subdirectories instead. The cron-job scripts from dirvish work well without that prefix.

Space consumption example

I created my dirvish setup when the source directory already held around 11 GiB of data. Initializing the vault doubled the space needed, so it is best to start with an empty source directory. All later snapshots only need space for the files that actually changed from one snapshot to the next:

# du -sch /srv/dirvish/employee/*
11G     /srv/dirvish/employee/2013.06.23-22.04.02
88M     /srv/dirvish/employee/2013.06.30-22.05.03
11M     /srv/dirvish/employee/2013.07.01-22.04.02
11M     /srv/dirvish/employee/2013.07.13-22.04.16
113M    /srv/dirvish/employee/2013.07.14-22.05.20
11M     /srv/dirvish/employee/2013.07.15-22.05.10
8.0K    /srv/dirvish/employee/dirvish
11G     total
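
When reading such a listing, keep in mind that du counts a hard-linked file only once per invocation, so shared data is attributed to the first directory in which it is found. The following sandbox sketch shows this with made-up data; it again uses cp -al as a stand-in for the rsync-based hardlinking of dirvish, and the variable names are chosen for this example only.

```shell
# Sandbox demo of how du attributes hard-linked data.
demo=$(mktemp -d)
mkdir "$demo/2013.06.23-22.04.02"
# 1 MiB of data in the "initial image".
head -c 1048576 /dev/urandom > "$demo/2013.06.23-22.04.02/data.bin"

# Second "image": only hard links to the first one, as for unchanged files.
cp -al "$demo/2013.06.23-22.04.02" "$demo/2013.06.30-22.05.03"

# One du run over both: the data is charged to the first directory only.
du -sk "$demo"/2013.*

first_kb=$(du -sk "$demo"/2013.* | awk 'NR == 1 { print $1 }')
second_kb=$(du -sk "$demo"/2013.* | awk 'NR == 2 { print $1 }')
echo "first image: ${first_kb}K, second image: ${second_kb}K"
```

The small numbers for later images therefore describe what each snapshot added, not what removing an older image would free.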

Resources