HOWTO: Ceph as Openstack nova-volumes/swift backend on Debian GNU/Linux wheezy

This HOWTO is not finished yet; don't use it for now.

This HOWTO aims to provide guidelines for installing Ceph and using it as the nova-volumes/swift backend for OpenStack.

The environment includes the following software:

Three Ceph nodes <cephX>, each with one MON daemon and two OSDs (on /dev/sda2 and /dev/sdb2)

An OpenStack “proxy” or “management” node <host.mgt>, and one or more pure OpenStack “compute” nodes <host.compute>

All OpenStack nodes have already been set up with the OpenStack HOWTO or the Puppet-based HOWTO.

DOCUMENT CONVENTIONS

In formatted blocks, commands prefixed with # are run as root, and commands prefixed with $ as an unprivileged user.

PREREQUISITES

Things to prepare beforehand:

Technical Choices

Only the OSD and MON daemons of Ceph will be configured, because the metadata daemon (MDS) is only needed when using CephFS. This configuration is intended to show Ceph serving as an RBD backend for nova-volumes and, through the RADOS gateway, as a swift backend.

Installation

CephX nodes

On *each Ceph node*, do:

# apt-get install -y ceph 

Configure Ceph (/etc/ceph/ceph.conf) with the following configuration:

[global]
    auth supported = cephx
    keyring = /etc/ceph/keyring.admin

[osd]
    osd data = /srv/ceph/osd$id
    osd journal = /srv/ceph/osd$id/journal
    osd journal size = 512
    keyring = /etc/ceph/keyring.$name

    ; working with ext4 (sileht: disable because xfs is used)
    ;filestore xattr use omap = true

    ; solve rbd data corruption (sileht: disable by default in 0.48)
    filestore fiemap = false

[osd.11]
    host = ceph1
    osd addr = 10.X.X.1
    devs = /dev/sda2
[osd.12]
    host = ceph1
    osd addr = 10.X.X.1
    devs = /dev/sdb2

[osd.21]
    host = ceph2
    osd addr = 10.X.X.2
    devs = /dev/sda2
[osd.22]
    host = ceph2
    osd addr = 10.X.X.2
    devs = /dev/sdb2

[osd.31]
    host = ceph3
    osd addr = 10.X.X.3
    devs = /dev/sda2
[osd.32]
    host = ceph3
    osd addr = 10.X.X.3
    devs = /dev/sdb2


[mon]
    mon data = /srv/ceph/mon$id
[mon.1]
    host = ceph1
    mon addr = 10.X.X.1:6789
[mon.2]
    host = ceph2
    mon addr = 10.X.X.2:6789
[mon.3]
    host = ceph3
    mon addr = 10.X.X.3:6789

Prepare the fstab to match the ceph.conf file by adding the following lines on node cephX:

/dev/sda2       /srv/ceph/osdX1  xfs rw,noexec,nodev,noatime,nodiratime,barrier=0   0   0
/dev/sdb2       /srv/ceph/osdX2  xfs rw,noexec,nodev,noatime,nodiratime,barrier=0   0   0

For ceph1 you have:

/dev/sda2       /srv/ceph/osd11  xfs rw,noexec,nodev,noatime,nodiratime,barrier=0   0   0
/dev/sdb2       /srv/ceph/osd12  xfs rw,noexec,nodev,noatime,nodiratime,barrier=0   0   0

For ceph2 you have:

/dev/sda2       /srv/ceph/osd21  xfs rw,noexec,nodev,noatime,nodiratime,barrier=0   0   0
/dev/sdb2       /srv/ceph/osd22  xfs rw,noexec,nodev,noatime,nodiratime,barrier=0   0   0

For ceph3 you have:

/dev/sda2       /srv/ceph/osd31  xfs rw,noexec,nodev,noatime,nodiratime,barrier=0   0   0
/dev/sdb2       /srv/ceph/osd32  xfs rw,noexec,nodev,noatime,nodiratime,barrier=0   0   0

Create the mount points on each node:

on ceph1:

mkdir -p /srv/ceph/{mon1,osd1{1,2}}

on ceph2:

mkdir -p /srv/ceph/{mon2,osd2{1,2}}

on ceph3:

mkdir -p /srv/ceph/{mon3,osd3{1,2}}
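
Once the mount points exist, mount the OSD filesystems declared in fstab; `mount -a` simply mounts everything listed there:

mount -a
df -h | grep /srv/ceph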

The next steps are performed on only one node of your choice.

Ensure you don't need a password for SSH between the nodes:

# ssh cephX uname -a
Linux cephX.domain.ltd 3.2.0-3-amd64 #1 SMP Thu Jun 28 09:07:26 UTC 2012 x86_64 GNU/Linux

If not, create an SSH key pair and push it to all nodes:

# ssh-keygen
# cat .ssh/id_rsa.pub >> .ssh/authorized_keys
# rsync -r .ssh/ root@ceph2:.ssh/
# rsync -r .ssh/ root@ceph3:.ssh/
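
To check that passwordless SSH now works toward every node, a quick loop (hostnames as used in this HOWTO):

# for node in ceph1 ceph2 ceph3; do ssh $node uname -n; done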

Create the ceph cluster:

# mkcephfs -a -c /etc/ceph/ceph.conf -k /etc/ceph/keyring.admin
temp dir is /tmp/mkcephfs.oqB5qpHXEi
preparing monmap in /tmp/mkcephfs.oqB5qpHXEi/monmap
/usr/bin/monmaptool --create --clobber --add 1 169.254.6.21:6789 --add 2 169.254.6.22:6789 --add 3 169.254.6.23:6789 --print /tmp/mkcephfs.oqB5qpHXEi/monmap
/usr/bin/monmaptool: monmap file /tmp/mkcephfs.oqB5qpHXEi/monmap
/usr/bin/monmaptool: generated fsid e0a0b83d-f188-4baf-82f2-3102fbb1c194
epoch 0
fsid e0a0b83d-f188-4baf-82f2-3102fbb1c194
last_changed 2012-07-17 08:45:35.681299
created 2012-07-17 08:45:35.681299
0: 169.254.6.21:6789/0 mon.1
1: 169.254.6.22:6789/0 mon.2
2: 169.254.6.23:6789/0 mon.3
/usr/bin/monmaptool: writing epoch 0 to /tmp/mkcephfs.oqB5qpHXEi/monmap (3 monitors)
=== osd.11 ===
2012-07-17 08:45:35.792982 7fe7bcf55780 created object store /srv/ceph/osd11 journal /srv/ceph/osd11/journal for osd.11 fsid e0a0b83d-f188-4baf-82f2-3102fbb1c194
creating private key for osd.11 keyring /etc/ceph/keyring.admin
creating /etc/ceph/keyring.admin

...

2012-07-17 08:46:08.993851 7f165d1a6760  adding osd.21 at {host=ceph2,pool=default,rack=unknownrack}
2012-07-17 08:46:08.993895 7f165d1a6760  adding osd.22 at {host=ceph2,pool=default,rack=unknownrack}
2012-07-17 08:46:08.993926 7f165d1a6760  adding osd.31 at {host=ceph3,pool=default,rack=unknownrack}
2012-07-17 08:46:08.993956 7f165d1a6760  adding osd.32 at {host=ceph3,pool=default,rack=unknownrack}
/usr/bin/osdmaptool: writing epoch 1 to /tmp/mkcephfs.oqB5qpHXEi/osdmap
Generating admin key at /tmp/mkcephfs.oqB5qpHXEi/keyring.admin
creating /tmp/mkcephfs.oqB5qpHXEi/keyring.admin
Building initial monitor keyring
added entity osd.11 auth auth(auid = 18446744073709551615 key=AQAPCgVQqGJCMBAA6y4blmINAgB+nrX3wPla2Q== with 0 caps)
added entity osd.12 auth auth(auid = 18446744073709551615 key=AQAPCgVQeKF9NhAAM7EPeskDwikMl1vPi2pWpw== with 0 caps)
added entity osd.21 auth auth(auid = 18446744073709551615 key=AQAWCgVQKKOpAxAAHe7W7KyASI2xnkdOilzSFQ== with 0 caps)
added entity osd.22 auth auth(auid = 18446744073709551615 key=AQAhCgVQiC4aLxAA1pT/rOUHg07MLablCnlppg== with 0 caps)
added entity osd.31 auth auth(auid = 18446744073709551615 key=AQAmCgVQWLCnIhAA692Rhs2rws8yQLrT8vXaBw== with 0 caps)
added entity osd.32 auth auth(auid = 18446744073709551615 key=AQAyCgVQQLIrFBAA/2lJMVPzsBFypCihJubdxg== with 0 caps)
=== mon.1 ===
/usr/bin/ceph-mon: created monfs at /srv/ceph/mon1 for mon.1
=== mon.2 ===
pushing everything to ceph2
/usr/bin/ceph-mon: created monfs at /srv/ceph/mon2 for mon.2
=== mon.3 ===
pushing everything to ceph3
/usr/bin/ceph-mon: created monfs at /srv/ceph/mon3 for mon.3
placing client.admin keyring in /etc/ceph/keyring.admin

Start Ceph on all nodes:

# /etc/init.d/ceph -a start

Check the status of Ceph:

# ceph -k /etc/ceph/keyring.admin -c /etc/ceph/ceph.conf health
2012-07-17 08:47:56.026981 mon <- [health]
2012-07-17 08:47:56.027389 mon.0 -> 'HEALTH_OK' (0)

# ceph -s
2012-07-17 13:30:28.537300    pg v1228: 6542 pgs: 6542 active+clean; 16 bytes data, 3387 MB used, 1512 GB / 1516 GB avail
2012-07-17 13:30:28.552231   mds e1: 0/0/1 up
2012-07-17 13:30:28.552267   osd e10: 6 osds: 6 up, 6 in
2012-07-17 13:30:28.552389   log 2012-07-17 10:21:54.329413 osd.31 10.X.X.3:6800/31088 1233 : [INF] 3.4 scrub ok
2012-07-17 13:30:28.552492   mon e1: 3 mons at {1=10.X.X.1:6789/0,2=10.X.X.2:6789/0,3=10.X.X.3:6789/0}
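
To watch cluster events continuously instead of taking a one-shot status, you can also run:

# ceph -w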

Testing RBD backend

Get the auth key:

# ceph-authtool --print-key /etc/ceph/keyring.admin | tee client.admin
AQADEQVQyAevJhAAnZAcUsmuf8tSLp+7jgXglQ==

Create a pool and a volume in it:

# rados lspools
data
metadata
rbd
# rados mkpool nova
# rados lspools
data
metadata
rbd
nova
# rbd --pool nova create --size 1024 rbd-test
# rbd --pool nova ls
rbd-test
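
Optionally, inspect the volume before mapping it (the exact output of rbd info varies with the Ceph version):

# rbd --pool nova info rbd-test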

Prepare and mount the volume on a node:

# modprobe rbd 
# rbd map rbd-test --pool nova --name client.admin --secret client.admin
# dmesg | tail
...
[63851.029151] rbd: rbd0: added with size 0x40000000
[66908.383667] libceph: client0 fsid 95d8f4b8-01d8-4b0d-8534-d4f1d32120c9
[66908.384701] libceph: mon2 169.254.6.21:6789 session established
[66908.387263]  rbd0: unknown partition table

# mkfs.btrfs /dev/rbd0
# mount /dev/rbd0 /mnt
# touch /mnt/rbd-test
# ls /mnt/
rbd-test

# rbd showmapped
id      pool    image   snap    device
0       nova    rbd-test        -       /dev/rbd0

Clean up the test:

# umount /mnt
# rbd unmap /dev/rbd/nova/rbd-test
# rbd --pool nova rm rbd-test
Removing image: 100% complete...done.
# rbd --pool nova ls

RADOS gateway

It will be installed on the ceph1 server. The following steps configure Apache, the radosgw FastCGI script, and the authentication between the radosgw FastCGI script and Ceph.

Configuration of ceph:

In /etc/ceph/ceph.conf add this:

[client.rados.gateway]
        host = ceph1
        keyring = /etc/ceph/keyring.rados.gateway
        rgw socket path = /tmp/radosgw.sock
        log file = /var/log/ceph/radosgw.log

Copy the file to all nodes:

scp /etc/ceph/ceph.conf ceph2:/etc/ceph/ceph.conf
scp /etc/ceph/ceph.conf ceph3:/etc/ceph/ceph.conf

Installation and configuration of apache2 on ceph1

apt-get install apache2 libapache2-mod-fastcgi radosgw
a2enmod rewrite
/etc/init.d/apache2 restart
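
To double-check that the required Apache modules are loaded (apache2ctl -M lists the active modules):

# apache2ctl -M | grep -Ei 'rewrite|fastcgi'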

Prepare the Apache virtual host in the file /etc/apache2/sites-available/rgw.conf:

<VirtualHost *:80>
        ServerName ceph1.fqdn.tld
        ServerAdmin root@ceph1
        DocumentRoot /var/www


        RewriteEngine On
        RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) /s3gw.fcgi?page=$1&params=$2&%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]

        FastCgiExternalServer /var/www/s3gw.fcgi -socket /tmp/radosgw.sock
        <IfModule mod_fastcgi.c>
                <Directory /var/www>
                        Options +ExecCGI
                        AllowOverride All
                        SetHandler fastcgi-script
                        Order allow,deny
                        Allow from all
                        AuthBasicAuthoritative Off
                </Directory>
        </IfModule>

        AllowEncodedSlashes On
        ErrorLog /var/log/apache2/error.log
        CustomLog /var/log/apache2/access.log combined
        ServerSignature Off
</VirtualHost>

Create the fcgi script /var/www/s3gw.fcgi:

#!/bin/sh
exec /usr/bin/radosgw -c /etc/ceph/ceph.conf -n client.rados.gateway

And make it executable:

# chmod +x /var/www/s3gw.fcgi

Enable the RADOS gateway VirtualHost and disable the default one:

# a2ensite rgw.conf
# a2dissite default
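
Reload Apache so the site changes take effect:

# /etc/init.d/apache2 reload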

Create the keyring for RADOS gateway:

# ceph-authtool --create-keyring /etc/ceph/keyring.rados.gateway
# chmod +r /etc/ceph/keyring.rados.gateway

Generate a new key for the RADOS gateway in the keyring and set its capabilities:

# ceph-authtool /etc/ceph/keyring.rados.gateway -n client.rados.gateway --gen-key
# ceph-authtool -n client.rados.gateway --cap osd 'allow rwx' --cap mon 'allow r' /etc/ceph/keyring.rados.gateway

Register this key with the cluster (a mon must be running on at least one node to do this):

# ceph -k /etc/ceph/keyring.admin  auth add client.rados.gateway -i /etc/ceph/keyring.rados.gateway
2012-07-17 18:12:33.216484 7f8a142e8760 read 117 bytes from /etc/ceph/keyring.rados.gateway
2012-07-17 18:12:33.218728 mon <- [auth,add,client.rados.gateway]
2012-07-17 18:12:33.221727 mon.0 -> 'added key for client.rados.gateway' (0)
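
Finally, start the gateway and check that Apache answers. This is only a smoke test; the exact response body depends on the radosgw version, and it assumes the radosgw package shipped an init script:

# /etc/init.d/radosgw start
# curl -i http://ceph1.fqdn.tld/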