Differences between revisions 8 and 9
Revision 8 as of 2010-08-06 17:06:02
Size: 15509
Comment: completing ornamenting, mentioning loop mount
Revision 9 as of 2010-08-06 17:42:38
Size: 15753
Comment: simplification, some optical improvements

Preparation of an Eucalyptus-ready cloud image

This page describes, in a down-to-earth manner, how to create the triplet of Debian root image, kernel and initrd needed for Eucalyptus cloud computing on the basis of KVM. It uses only commonly known tools, nothing fancy, and is thus both practical and educational. The procedure was worked out by Dominique (<domibel>) as part of (or slightly beyond) his 2010 Google Summer of Code project, with some ornamental editing by Steffen (<moeller>). It achieves effectively the same result as vmbuilder. However, we find this approach easier, and the skills laid out here can be expected to apply in various other situations.

An overview of the process:

  • A blank file serves as an install medium
  • Qemu runs a regular Debian installer CD image with that blank file as an empty disk to be installed to
  • That disk is booted directly
  • Basic packages are installed and the image is cleaned up to render it ready for the cloud
  • The initrd and kernel are copied out of the virtual image
  • The virtual image is transformed into a regular loop-mountable image
  • The result is uploaded to the Eucalyptus installation

The boxes present code that should be executed on your machine. Since two machines are involved, the physical host and the virtual guest, this may be confusing at first read. To allow easy copy and paste, only the short note "physical" or "virtual" is placed above every box. Caveat: you never know whether this page was modified by someone with a somewhat strange sense of humor. Since quite a few commands are executed as root, do try to understand what every line is doing.

Retrieve packages for building, other preparations

Before following these instructions, check that you have sufficient disk space available. You need twice as much free space as you want to reserve as disk space for your image, i.e. at least 2*2GB.

physical:

df
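The check can also be scripted. The following is only a sketch: the helper name have_free_gib and the variable image_gib are made up for this illustration, and it assumes a POSIX df and awk on the physical machine.

```shell
# Hypothetical helper: succeeds if the filesystem holding the current
# directory has at least $1 GiB available.
have_free_gib() {
    need_kib=$(( $1 * 1024 * 1024 ))
    # df -P prints one line per filesystem; with -k, field 4 is the free space in KiB
    avail_kib=$(df -Pk . | awk 'NR==2 {print $4}')
    [ "$avail_kib" -ge "$need_kib" ]
}

image_gib=2   # planned size of the disk image in GiB
if have_free_gib $(( 2 * image_gib )); then
    echo "enough room for a ${image_gib}G image and its extracted copy"
else
    echo "less than $(( 2 * image_gib ))G free in $(pwd)" >&2
fi
```

The factor of two reflects that the full disk image and the root file system extracted from it exist side by side for a while.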

You need the following packages for the installation:

physical:

apt-get install qemu-kvm xtightvncviewer wget 

Create your own images

Debian offers images with compilations of its latest packages, readily available as burnable ISO files. Further down we will boot one of those with a virtual disk attached to perform a regular Debian installation onto that disk. Here we propose to download the very latest snapshot of the testing distribution (which is closest to our heart as Debian developers). This close to the release of the next version, those snapshots should work (tested June 24th, 2010 and again August 4th); consider falling back to an official release if you experience difficulties. The following blocks should yield a readily uploadable virtual image even if you only paste the command lines into a shell. You need root privileges at some point.

Please decide on the Debian architecture to use.

physical:

#arch=i386
arch=amd64

Now perform the download.

physical:

wget http://cdimage.debian.org/cdimage/daily-builds/daily/arch-latest/${arch}/iso-cd/debian-testing-${arch}-netinst.iso
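To see what the shell will actually request before starting the download, the URL can be assembled in a variable first. This is only a sketch; the variable names iso and url are introduced here for illustration. Note that ${arch} expands the variable set above, whereas $(arch) would run the command "arch", which prints the machine name of the running kernel (for example x86_64) rather than the Debian architecture name.

```shell
arch=amd64
# File name and URL follow the pattern of the daily netinst builds used on this page
iso="debian-testing-${arch}-netinst.iso"
url="http://cdimage.debian.org/cdimage/daily-builds/daily/arch-latest/${arch}/iso-cd/${iso}"
echo "$url"
# When the printed URL looks right, fetch it with:
# wget "$url"
```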

Create the raw disk. Note that we experienced difficulties with the qcow2 format, hence the "-f raw".

physical:

kvm-img create -f raw disk.img 2G

Now boot the netinst CD image. It will greet you with the Debian installer, as if you had booted a regular machine from CD, but you will not see it in a window, because we deliberately gave kvm the "-nographic" option. Why? Because everyone is doing it; feel free to edit this wiki page and give a proper reason that goes beyond "we could execute this remotely without installing X".

physical:

sudo kvm -m 256 -cdrom debian-testing-${arch}-netinst.iso -drive file=disk.img,if=scsi,index=0 -boot d -net nic -net user -nographic -vnc :0

Keep the above process running and start the platform-independent viewer in another shell. If it runs on the same host (as expected), you can leave the field for the hostname to connect to empty.

physical:

xtightvncviewer
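If the viewer is started on a different machine, or if a display other than :0 was chosen, it helps to know that VNC display N listens on TCP port 5900+N. A minimal sketch, with the variable names display and port chosen only for this illustration:

```shell
display=0                      # the N in "-vnc :N" of the kvm command line
port=$(( 5900 + display ))     # VNC display N listens on TCP port 5900+N
echo "viewer target: localhost:${display} (TCP port ${port})"
# To connect explicitly instead of filling in the dialog:
# xtightvncviewer localhost:${display}
```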

The Debian installer has become very user friendly. When preparing a cloud image there are a few settings one may want to change from the defaults, but not many:

  • hostname - left at the default "debian". It would be preferable to have a hostname that is set dynamically, derived in some way from the IP address assigned by the DHCP daemon
  • Time zone - the setting does not matter
  • Choose a guided installation
  • Set an arbitrary but safe root password. Root logins should later happen only via ssh, but we need the root password when first logging in.
  • We don't necessarily need to create a user, but it does no harm either. Just think about the consequences when publishing the image, so set passwords properly if you do more than testing.
  • Partitioning - manually, create first partition ext3, no swap

After the installation the thus prepared disk can be booted:

physical:

sudo kvm -m 256 -drive file=disk.img,if=scsi,index=0,boot=on -boot c -net nic -net user -nographic -vnc :0 -redir tcp:5555:10.0.2.15:22

The previous VNC connection will have terminated. Start it again to see the Debian login prompt or to investigate problems of various sorts. To have copy and paste, however, connect via ssh:

physical:

 ssh -X -p 5555 guest@localhost

We need to do some cleanup first. We suggest removing the cdrom entry from the package sources, even though this is not strictly required. Execute this (like the following commands) from a root shell of the virtual image:

virtual:

 vi /etc/apt/sources.list
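For those who prefer not to edit by hand, the same change can be sketched with sed. The function name disable_cdrom_source is made up for this illustration, and the sketch assumes GNU sed (as shipped with Debian) and that the installer wrote the entry as a line starting with "deb cdrom:".

```shell
# Hypothetical helper: comment out every "deb cdrom:" line in the given
# sources.list file, leaving all other lines untouched.
disable_cdrom_source() {
    sed -i 's/^deb cdrom:/# &/' "$1"
}

# On the virtual machine, as root:
# disable_cdrom_source /etc/apt/sources.list
```

Commenting the line out rather than deleting it keeps a record of what the installer had configured.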

The following packages shall form the basic stock for the virtual image.

virtual:

 aptitude install curl devscripts nmap locales ntpdate cvs subversion git cmake-curses-gui

Of the above, we somehow need

  • curl to get the IP assigned
  • devscripts since we all want to do various sorts of Debian-associated compilations/developments and the dependencies of devscripts are also helpful ... ok, you don't really need devscripts for the cloud, but just tolerate them for now.
  • nmap to learn about firewalls and all the things that seem strange
  • locales since it causes annoying trouble if it is not available. If LANG=C is set as an environment variable you should not need it.
  • ntpdate since synchronisation is essential, be it for MPI or merely for running make across multiple hosts
  • cvs, subversion, git - well, you don't need them, really, unless you are more into development
  • cmake-curses-gui - some love it

For MPI programming also install

virtual:

 aptitude install openmpi-dev openmpi-bin
  • openmpi-bin is needed for the execution of the MPI-savvy programs
  • openmpi-dev is needed to compile them

For the queueing system, which this project is about, we will have a separate section. The packages for the queueing system shall not be part of the bare image but be installed after the image is already running, i.e. with all information on IP numbers and hostnames available.

Replace /etc/rc.local:

virtual:

cat <<EORCLOCAL > /etc/rc.local
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

# load pci hotplug for dynamic disk attach in KVM (for EBS)
depmod -a
modprobe acpiphp

# simple attempt to get the user ssh key using the meta-data service
mkdir -p /root/.ssh
echo >> /root/.ssh/authorized_keys
curl -m 10 -s http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key | grep 'ssh-rsa' >> /root/.ssh/authorized_keys
echo "AUTHORIZED_KEYS:"
echo "************************"
cat /root/.ssh/authorized_keys
echo "************************"
exit 0
EORCLOCAL
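Since a broken rc.local only shows up at the next boot, it may be worth checking the freshly written file right away. The following is a sketch under the assumption that /bin/sh is the interpreter, as stated in the first line of the script; the function name check_rc_local is made up for this illustration.

```shell
# Hypothetical helper: verify that an rc.local file parses without syntax
# errors and has its execution bits set (without them the script is disabled).
check_rc_local() {
    sh -n "$1" || { echo "syntax error in $1" >&2; return 1; }
    [ -x "$1" ] || { echo "$1 is not executable; run: chmod +x $1" >&2; return 1; }
    echo "$1 looks fine"
}

# On the virtual machine, as root:
# check_rc_local /etc/rc.local
```

The option -n makes the shell read the script and report syntax errors without executing any command in it.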

The following is not completely tested

For MPI, on user account: guest

 ssh-keygen -t rsa

Test, not sure if this works

 cat /home/guest/.ssh/id_rsa.pub >> /home/guest/.ssh/authorized_keys
 chmod 600 /home/guest/.ssh/authorized_keys
 chmod 700 .ssh

Preparation of the kernel and initrd images

Our current virtual image is apparently functional. Knowing about the interdependencies of kernels and modules, we don't want to fiddle with them but use exactly the kernel and initrd files installed in the image. Adjustments can still be performed within that root image with the usual Debian means, which we don't explain here.

Copy initrd and vmlinuz to your front end:

virtual:

# say what username to copy to
yourid=root
# determine the IP address of your physical node, which here acts as the gateway
youripnum=$(/sbin/route -n | awk '{print $2}' | egrep -v '^(IP|Gateway|0\.0\.0\.0)')
# (alternatively, choose the IP address of any machine that knows how to upload to Eucalyptus)
scp /boot/initrd.img-$(uname -r) /boot/vmlinuz-$(uname -r) ${yourid}@${youripnum}:
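The pipeline above prints every gateway listed in the routing table. A slightly stricter variant, sketched here with the made-up function name default_gateway, picks only the gateway of the default route. With the user-mode networking used on this page ("-net user") that gateway is normally 10.0.2.2, which is how the guest reaches the physical host.

```shell
# Hypothetical helper: read "route -n" output on stdin and print the gateway
# of the default route (the line whose destination is 0.0.0.0).
default_gateway() {
    awk '$1 == "0.0.0.0" && $2 != "0.0.0.0" {print $2; exit}'
}

# On the virtual machine:
# youripnum=$(/sbin/route -n | default_gateway)
```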

Delete the line "127.0.1.1 debian" in /etc/hosts for now; we will add such a line again later, once the hostname and IP address are known.

virtual:

vi /etc/hosts
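The same edit can be sketched non-interactively. The function name drop_local_hostname is made up for this illustration, and GNU sed is assumed.

```shell
# Hypothetical helper: delete the line that maps 127.0.1.1 to the
# installer-chosen hostname; all other entries stay as they are.
drop_local_hostname() {
    sed -i '/^127\.0\.1\.1[[:space:]]/d' "$1"
}

# On the virtual machine, as root:
# drop_local_hostname /etc/hosts
```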

For squeeze you have to delete the following file. It pins network interface names to MAC addresses, and since every new instance gets a different MAC address, the interface would no longer come up under its old name.

virtual:

rm -rf /etc/udev/rules.d/70-persistent-net.rules

Don't reboot just now! The file would be regenerated at the next boot by /lib/udev/write_net_rules. Instead, shut down:

virtual:

shutdown -h now

Then extract the image of your root file system as described below:

physical:

sudo parted disk.img

When in parted, enter:

physical:

unit b
print

You will see something analogous to the following table:

Number  Start        End          Size         Type      File system     Flags
 1      1048576B     1988100095B  1987051520B  primary   ext3            boot
 2      1989147648B  2146435071B  157287424B   extended
 5      1989148672B  2146435071B  157286400B   logical   linux-swap(v1)

Leave with

physical:

quit

The first partition does not start at block 0 or block 1; there is an offset, indicated in bytes. With 512 bytes per block, we can compute the number of blocks to skip to reach the first real byte of the partition.

Number of blocks to skip:

 1048576 / 512 = 2048

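Number of blocks to read, (end - start + 1) / block size. The End column is inclusive, so this equals the Size column divided by 512:

 (1988100095 - 1048576 + 1) / 512 = 3880960

Now do the actual transfer of the contents. Please adapt the numbers to the size of your disk.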

physical:

 dd if=disk.img of=root.img bs=512 skip=2048 count=3880960

The same for our 2G image, letting the shell do the arithmetic (to be executed in a bash shell):

physical:

dd if=disk.img of=root.img bs=512 skip=$((1048576/512)) count=$(( (1988100095-1048576+1)/512 ))
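For other disk sizes it may be easier to put the two numbers from the parted table into variables and let the shell derive the dd arguments. This is a sketch; the variable names sector, start, end, skip and count are chosen only for this illustration. The End column is inclusive, hence the "+ 1"; without it the integer division would drop the last block.

```shell
sector=512
start=1048576      # "Start" column of partition 1, without the trailing B
end=1988100095     # "End" column of partition 1 (inclusive), without the trailing B

skip=$(( start / sector ))
count=$(( (end - start + 1) / sector ))

# Print the resulting command for inspection instead of running it right away
echo "dd if=disk.img of=root.img bs=$sector skip=$skip count=$count"
```

With the values from the table above this prints "dd if=disk.img of=root.img bs=512 skip=2048 count=3880960".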

Now this triplet (rootfs, vmlinuz, initrd) shall be tested in combination. If the kernel and initrd are still in the home directory of the user previously indicated by '$yourid', then move them to your current directory for the sake of simplicity:

physical:

yourid=root
mv  ~$yourid/initrd.img-2.6.32-5-amd64 ~$yourid/vmlinuz-2.6.32-5-amd64 .

It is this triplet that we will then also explicitly specify for the Eucalyptus images. Since we are booting the extracted root file system, which has no partition table, the root device has changed from /dev/sda1 to /dev/sda. Please adjust the kernel version, which may have increased by the time you read this, and the architecture if you are on a different platform:

physical:

sudo kvm -m 256 -hda root.img -net nic -net user -nographic -vnc :0 \
         -initrd initrd.img-2.6.32-5-amd64 \
         -kernel vmlinuz-2.6.32-5-amd64 \
         -append "rootwait root=/dev/sda"
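To avoid typing the kernel version by hand, it can be derived from the file name of the copied kernel. The following is a sketch; the function name kernel_version_in is made up for this illustration, and it assumes that exactly one vmlinuz-* file lies in the given directory.

```shell
# Hypothetical helper: print the version suffix of the single vmlinuz-* file
# found in directory $1, e.g. "2.6.32-5-amd64" for vmlinuz-2.6.32-5-amd64.
kernel_version_in() {
    set -- "$1"/vmlinuz-*
    [ $# -eq 1 ] && [ -e "$1" ] || { echo "expected exactly one vmlinuz-* file" >&2; return 1; }
    basename "$1" | sed 's/^vmlinuz-//'
}

# Usage in the directory holding root.img and the copied files:
# version=$(kernel_version_in .) &&
#   sudo kvm -m 256 -hda root.img -net nic -net user -nographic -vnc :0 \
#        -initrd "initrd.img-$version" -kernel "vmlinuz-$version" \
#        -append "rootwait root=/dev/sda"
```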

With this restart, that unfortunate persistent-net helper file was created again. Please remove it again. Now.

virtual:

rm -rf /etc/udev/rules.d/70-persistent-net.rules

Also remove your bash history (just in case you did something sensitive); it will be created again at the next boot.

virtual:

rm ~/.bash_history

For some further cleanup, remove whatever you want to remove; everything we clean up will save us money or time with every instance created.

virtual:

aptitude clean
halt 

Some tricks

The root.img file is a regular file system image that can be loop-mounted on the local machine.

physical:

mkdir tmp_mnt
sudo mount -o loop root.img tmp_mnt && ls tmp_mnt

If successful, one will see the root directory of root.img under tmp_mnt. The mount can be used to change files, too, e.g.

physical:

sudo rm tmp_mnt/etc/udev/rules.d/70-persistent-net.rules

To unmount, perform

physical:

sudo umount tmp_mnt

Some more is possible, e.g. one can run

physical:

sudo chroot tmp_mnt

and even install packages without virtualisation. But this is unsafe; rather use the virtual image.

Bundling and uploading to Eucalyptus

We can now proceed as described in any regular cloud computing textbook, i.e. in the Eucalyptus documentation. From your physical machine, with the euca2ools package installed, perform the following:

physical:

# if the images copied from your virtual machine are not yet in the CWD
# then add links to those or copy them to here.
euca-bundle-image -i vmlinuz-2.6.32-5-amd64 --kernel true
euca-upload-bundle -b amd64 -m /tmp/vmlinuz-2.6.32-5-amd64.manifest.xml
euca-register amd64/vmlinuz-2.6.32-5-amd64.manifest.xml
# use the kernel ID that euca-register just printed, not this example value
EKI=eki-04B810D0

euca-bundle-image -i initrd.img-2.6.32-5-amd64 --ramdisk true
euca-upload-bundle -b amd64 -m /tmp/initrd.img-2.6.32-5-amd64.manifest.xml
euca-register amd64/initrd.img-2.6.32-5-amd64.manifest.xml
# likewise, use the ramdisk ID that euca-register just printed
ERI=eri-3BB111B0

euca-bundle-image -i root.img --kernel $EKI --ramdisk $ERI
euca-upload-bundle -b amd64 -m /tmp/root.img.manifest.xml
euca-register amd64/root.img.manifest.xml
# and the machine image ID that euca-register just printed
EMI=emi-1B1B0C94
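Copying the IDs by hand is error-prone, so they can also be captured directly. The following sketch assumes that euca-register reports its result as a line of the form "IMAGE <id>"; please verify this against the output of your euca2ools version. The function name registered_id is made up for this illustration.

```shell
# Hypothetical helper: read euca-register output on stdin and print the
# image ID, assuming a result line of the form "IMAGE <id>".
registered_id() {
    awk '$1 == "IMAGE" {print $2; exit}'
}

# Usage on the physical machine:
# EKI=$(euca-register amd64/vmlinuz-2.6.32-5-amd64.manifest.xml | registered_id)
# ERI=$(euca-register amd64/initrd.img-2.6.32-5-amd64.manifest.xml | registered_id)
# echo "kernel: $EKI  ramdisk: $ERI"
```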

The images can now be inspected on, and are available from, the Eucalyptus server. One can also see that the root/machine image (EMI) is connected to the kernel (EKI) and initrd/ramdisk (ERI) images.

physical:

$ euca-describe-images
IMAGE   eki-D224100C    amd64/vmlinuz-2.6.32-5-amd64.manifest.xml       guest available       public          x86_64  kernel
IMAGE   emi-1B1B0C94    amd64/root.img.manifest.xml     guest available       public          x86_64  machine eki-D224100C    eri-059910F2
IMAGE   eri-059910F2    amd64/initrd.img-2.6.32-5-amd64.manifest.xml    guest available       public          x86_64  ramdisk

To start an instance, we proceed just as we did with the official Eucalyptus images before:

physical:

 $ euca-run-instances -k mykey $EMI
RESERVATION     r-35140741      guest guest-default
INSTANCE        i-28CC058F      emi-1B1B0C94    0.0.0.0 0.0.0.0 pending mykey   0               m1.small        2010-08-04T16:48:45.983Z        debian-rocks     eki-D224100C    eri-059910F2
$ euca-describe-instances i-28CC058F
RESERVATION     r-35140741      guest default
INSTANCE        i-28CC058F      emi-1B1B0C94    0.0.0.0 0.0.0.0 pending mykey   0               m1.small        2010-08-04T16:48:45.983Z        debian-rocks     eki-D224100C    eri-059910F2

and after some time, once the DHCP daemon has been contacted:

physical:

$ euca-describe-instances i-28CC058F
RESERVATION     r-35140741      moeller default
INSTANCE        i-28CC058F      emi-1B1B0C94    192.168.100.102  192.168.100.102  running mykey   0               m1.small        2010-08-04T16:48:45.983Z        debian-rocks     eki-D224100C        eri-059910F2
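Waiting for the state to change from "pending" to "running" can be scripted. The following sketch relies on the column layout shown in the listings above, where the fourth whitespace-separated field of the INSTANCE line is the IP address and the sixth is the state. The function names instance_ip, instance_state and wait_until_running, as well as the limit of 30 attempts, are made up for this illustration.

```shell
# Hypothetical helpers: read euca-describe-instances output on stdin and
# print one field of the first INSTANCE line.
instance_ip()    { awk '$1 == "INSTANCE" {print $4; exit}'; }
instance_state() { awk '$1 == "INSTANCE" {print $6; exit}'; }

# Poll every 10 seconds, at most 30 times, until instance $1 is running.
wait_until_running() {
    tries=30
    while [ "$tries" -gt 0 ]; do
        state=$(euca-describe-instances "$1" | instance_state)
        [ "$state" = "running" ] && return 0
        tries=$(( tries - 1 ))
        sleep 10
    done
    return 1
}

# Usage on the physical machine:
# wait_until_running i-28CC058F &&
#   euca-describe-instances i-28CC058F | instance_ip
```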