

The 'euca2ools' are a free clone of the management utilities for the Amazon Elastic Compute Cloud (EC2). To use them, you need an account with the Eucalyptus Partner Cloud (EPC), a local installation of the Eucalyptus cloud infrastructure (a corresponding Debian package is in preparation), or an account with Amazon.com's services. Amazon's EC2 tools are a moving target; nonetheless, upstream is eager to keep the differences to a minimum. For any issues that you observe, please contact them via the routes presented under the external links at the bottom of this page.

A Debian package was kindly provided by the upstream developers at Eucalyptus and was recently submitted to the Debian NEW queue.

euca2ools with Amazon's Web Services

The following presents a first success story for using the euca2ools for Amazon's services.

Preparation

There are two ways to authenticate oneself on the net: ID and password (most common) or X.509 certificates (increasingly common). EC2 allows either, but as of version 1.3.1, some euca2ools commands only work with the former. Hence one needs to log in to Amazon, choose "Account" in the top tab row, then "Security Credentials" on the left. The access key (like a regular ID) and the secret access key (like a password) need to be passed to the euca2ools somehow. Do so by setting environment variables; also specify the Amazon endpoint:

export EC2_ACCESS_KEY=0GFSRLKJISFDKUHG
export EC2_SECRET_KEY="8iHgfFj/kGHFh8jVvHkTGfREg+G"
export EC2_URL=http://ec2.amazonaws.com

The same information can also be passed to the commands through configuration files, like ~/.eucarc.
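For example, a minimal ~/.eucarc could simply collect the exports shown above; the key values here are the same placeholders as above, not real credentials:

```shell
# Sketch of a minimal ~/.eucarc; the key values are placeholders.
cat > ~/.eucarc <<'EOF'
export EC2_ACCESS_KEY=0GFSRLKJISFDKUHG
export EC2_SECRET_KEY="8iHgfFj/kGHFh8jVvHkTGfREg+G"
export EC2_URL=http://ec2.amazonaws.com
EOF

# Source it once per shell session before calling any euca2ools command:
. ~/.eucarc
```

Whether a given euca2ools version picks up ~/.eucarc on its own may vary; sourcing the file explicitly always works.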

Show regions

There are multiple Amazon clouds (regions), and one pays extra for data transfer between them.

$ euca-describe-regions
REGION  eu-west-1       ec2.eu-west-1.amazonaws.com
REGION  us-east-1       ec2.us-east-1.amazonaws.com
REGION  ap-northeast-1  ec2.ap-northeast-1.amazonaws.com
REGION  us-west-1       ec2.us-west-1.amazonaws.com
REGION  ap-southeast-1  ec2.ap-southeast-1.amazonaws.com
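The endpoint column of this output can be extracted mechanically. A small sketch (the helper name region_endpoint is ours, not part of euca2ools), shown here against a canned sample line rather than a live call:

```shell
# Extract the endpoint for a given region from describe-regions output.
# Fields are: REGION <name> <endpoint>, whitespace-separated.
region_endpoint () {
    awk -v r="$1" '$1 == "REGION" && $2 == r { print $3 }'
}

# Canned sample instead of a live euca-describe-regions call:
printf 'REGION\teu-west-1\tec2.eu-west-1.amazonaws.com\n' | region_endpoint eu-west-1
```

With credentials set up, one would pipe the real command instead: euca-describe-regions | region_endpoint eu-west-1.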

Show zones

$ euca-describe-availability-zones
AVAILABILITYZONE        us-east-1a      available
AVAILABILITYZONE        us-east-1b      available
AVAILABILITYZONE        us-east-1c      available
AVAILABILITYZONE        us-east-1d      available

Using the REGION URLs obtained through euca-describe-regions, one can browse availability zones other than the default us-east-1.

$ EC2_URL=https://eu-west-1.ec2.amazonaws.com euca-describe-availability-zones
AVAILABILITYZONE        eu-west-1a      available
AVAILABILITYZONE        eu-west-1b      available
AVAILABILITYZONE        eu-west-1c      available

The above line makes use of the not-so-well-known shell feature of setting an environment variable for a single program invocation by prepending the assignment to the command.
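The mechanism itself can be demonstrated without any cloud account: the assignment affects only the one command and leaves the calling shell untouched.

```shell
# The prefix assignment is visible inside the single command...
EC2_URL=https://eu-west-1.ec2.amazonaws.com sh -c 'echo "$EC2_URL"'

# ...but the calling shell's environment is unchanged afterwards
# (this prints "unset" unless you exported EC2_URL earlier, as in
# the Preparation section).
echo "${EC2_URL:-unset}"
```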

Describe images

The command euca-describe-images, if executed with no extra arguments, lists only those images that one has created oneself. I have just successfully tested the removal of images, so I cannot offer any such output any longer :) But the "-a" option comes to our rescue: it lists all images, which we can then filter down to those closest to our heart. The list of 64-bit Debian images is

$ EC2_URL=https://ec2.amazonaws.com euca-describe-images -a | egrep 'available.+public.+x86_64' | awk '{print $2,$3}' | grep -i debian | sort -k 2
ami-2766824e 07QBCC8V0TC3NZPSA002.ec2-ami-debianlenny64-equinoxmilliwar-06/image.manifest.xml
ami-c3ee09aa 07qbcc8v0tc3nzpsa002.ec2-ami-debianlenny64-equinoxmilliwar-07/image.manifest.xml
ami-1fbe5a76 alestic-64/debian-4.0-etch-base-64-20080630.manifest.xml
ami-e59d798c alestic-64/debian-4.0-etch-base-64-20080802.manifest.xml
...
ami-f957b090 alestic-64/debian-6.0-squeeze-base-64-20090418.manifest.xml
ami-6f729406 alestic-64/debian-6.0-squeeze-base-64-20090614.manifest.xml
ami-61729408 alestic-64/debian-6.0-squeeze-desktop-64-20090614.manifest.xml
ami-d15bbfb8 Kaavo-basic-64-Debian/imod-basic-64-debian.manifest.xml

The above are disk images, not running machines (those would be called instances). Being 64-bit, they are not compatible with the small instance type, which is what we want for testing purposes to save a few cents. Hence, please also check out this selection of 32-bit images:

$ euca-describe-images -a | egrep 'available.+public.+i386' | awk '{print $2,$3}' | grep -i debian | sort -k 2 | grep squeeze
ami-64fe190d alestic/debian-6.0-squeeze-base-20090215.manifest.xml
ami-e048af89 alestic/debian-6.0-squeeze-base-20090418.manifest.xml
ami-adfe19c4 alestic/debian-6.0-squeeze-desktop-20090215.manifest.xml
ami-0256b16b alestic/debian-6.0-squeeze-desktop-20090419.manifest.xml
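The filter pipeline used above can be wrapped into a small function and tested offline against a canned line of describe-images output. The function name list_debian_images and the sample line (with a fake owner ID) are ours; the field layout is approximated from the output above:

```shell
# Wrapper around the filter pipeline; reads euca-describe-images
# output on stdin, so it works with the live command or a sample.
list_debian_images () {
    arch="$1"    # i386 or x86_64
    egrep "available.+public.+$arch" | awk '{print $2,$3}' \
        | grep -i debian | sort -k 2
}

# Illustrative sample line instead of a live call:
printf 'IMAGE\tami-64fe190d\talestic/debian-6.0-squeeze-base-20090215.manifest.xml\t999999999999\tavailable\tpublic\ti386\tmachine\n' \
    | list_debian_images i386
```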

Running an image

Whoever runs the image pays for it, at least with Amazon: every boot costs some 10 cents, and every further hour another dime. To ensure that only the individual who pays can log in, one needs to identify oneself to the machine somehow. This is done with asymmetric cryptography: you send the public part of a key pair to the EC2 server, and nobody can derive the secret part from it. In the most common textbook explanation, one generates two (very) large prime numbers and sends their product away as the public part; the server then only lets someone in who can provide a divisor of that (very, very) large product.

Generating a key

$ euca-add-keypair keypair_for_regular_debian_machine > keypair_private.asc_complete
$ chmod 600 keypair_private.asc_complete # which I had originally forgotten about
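A tiny guard (the helper name key_perms_ok and the demo file name are ours) can check up front that the private-key file has the permissions ssh will later insist on, sparing the "UNPROTECTED PRIVATE KEY FILE" surprise documented further down this page:

```shell
# Succeed only if the file is readable by its owner alone.
key_perms_ok () {
    p=$(stat -c %a "$1")        # GNU stat: octal permission bits
    [ "$p" = "600" ] || [ "$p" = "400" ]
}

touch demo_keyfile
chmod 644 demo_keyfile
key_perms_ok demo_keyfile || echo "too open"
chmod 600 demo_keyfile
key_perms_ok demo_keyfile && echo "ok"
```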

Starting the image

$ euca-run-instances ami-e048af89 -k keypair_for_regular_debian_machine
RESERVATION     r-af0d45c6      166691755018    default
INSTANCE        i-43714a2a      ami-e048af89                    pending keypair_for_regular_debian_machine      2009-07-18T19:40:54.000Z        aki-a71cf9ce    ari-a51cf9cc

Status of instances

$ euca-describe-instances
RESERVATION     r-f30d459a      166691755018    default
INSTANCE        i-957249fc      ami-e048af89    ec2-174-129-171-68.compute-1.amazonaws.com      domU-12-31-39-00-55-02.compute-1.internal       running         keypair_for_regular_debian_machine      0       m1.small        2009-07-18T19:40:07.000Z        us-east-1b      aki-a71cf9ce    ari-a51cf9cc
RESERVATION     r-af0d45c6      166691755018    default
INSTANCE        i-43714a2a      ami-e048af89    ec2-174-129-161-80.compute-1.amazonaws.com      domU-12-31-39-00-C5-02.compute-1.internal       running         keypair_for_regular_debian_machine      0       m1.small        2009-07-18T19:40:54.000Z        us-east-1b      aki-a71cf9ce    ari-a51cf9cc

Ouch, this is two instances. I only need one (or rather none, since I only started them to teach myself how to start them), so let us remove one: the first, which was started with the original EC2 tools rather than with euca2ools and may hence be considered "non-free" (just kidding).

Stopping an instance

They call it "terminate".

$ euca-terminate-instances i-957249fc
INSTANCE        i-957249fc

and sometimes it feels good to verify that the beast indeed shuts down

$ euca-describe-instances
RESERVATION     r-f30d459a      166691755018    default
INSTANCE        i-957249fc      ami-e048af89                    terminated      keypair_for_regular_debian_machine      0       m1.small        2009-07-18T19:40:07.000Z        us-east-1b      aki-a71cf9ce    ari-a51cf9cc
RESERVATION     r-af0d45c6      166691755018    default
INSTANCE        i-43714a2a      ami-e048af89    ec2-174-129-161-80.compute-1.amazonaws.com      domU-12-31-39-00-C5-02.compute-1.internal       running         keypair_for_regular_debian_machine      0       m1.small        2009-07-18T19:40:54.000Z        us-east-1b      aki-a71cf9ce    ari-a51cf9cc

Logging in for the first time

Check if the instance is ready

$ euca-get-console-output i-43714a2a
...

Listening on LPF/eth0/12:31:39:00:c5:02
Sending on   LPF/eth0/12:31:39:00:c5:02
Sending on   Socket/fallback           
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7
DHCPOFFER from 169.254.1.0                                
DHCPREQUEST on eth0 to 255.255.255.255 port 67            
DHCPACK from 169.254.1.0                                  
bound to 10.254.202.240 -- renewal in 39624 seconds.      
done.                                                     
Generating public/private rsa key pair.                   
Your identification has been saved in /etc/ssh/ssh_host_rsa_key.
Your public key has been saved in /etc/ssh/ssh_host_rsa_key.pub.
The key fingerprint is:                                         
bd:f7:6f:c2:63:85:c5:b3:0e:4f:a6:66:0f:03:67:d3 host            
The key's randomart image is:                                   
+--[ RSA 2048]----+                                             
|                 |
|                 |
|               . |
|         .    ..o|
|        S .. + Eo|
|           .+.o+.|
|          . .+B. |
|           . =Bo.|
|            o.o*.|
+-----------------+
Generating public/private dsa key pair.
Your identification has been saved in /etc/ssh/ssh_host_dsa_key.
Your public key has been saved in /etc/ssh/ssh_host_dsa_key.pub.
The key fingerprint is:
d9:25:a9:4e:26:12:35:56:44:46:d6:81:0b:f7:2f:13 host
The key's randomart image is:
+--[ DSA 1024]----+
|      +=Bo..     |
|     o.+o ..     |
|    .  o oo .    |
|     .  .+Eo     |
|    . . S .o     |
|     . =  o .    |
|        .  o     |
|                 |
|                 |
+-----------------+
ec2: -----BEGIN SSH HOST KEY FINGERPRINTS-----
/etc/ssh/ssh_host_key.pub: No such file or directory
ec2: 2048 bd:f7:6f:c2:63:85:c5:b3:0e:4f:a6:66:0f:03:67:d3 /etc/ssh/ssh_host_rsa_key.pub (RSA)
ec2: 1024 d9:25:a9:4e:26:12:35:56:44:46:d6:81:0b:f7:2f:13 /etc/ssh/ssh_host_dsa_key.pub (DSA)
ec2: -----END SSH HOST KEY FINGERPRINTS-----
INIT: Entering runlevel: 4
Starting enhanced syslogd: rsyslogd.
Starting OpenBSD Secure Shell server: sshdNET: Registered protocol family 10
lo: Disabled Privacy Extensions
Mobile IPv6
.
Starting periodic command scheduler: crond.

Well, looks like it is. The command works non-destructively, by the way, i.e. you can request the output repeatedly.
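Since the command is non-destructive, one can simply poll it until the sshd line appears. A generic sketch (wait_for is our own helper; the euca call in the comment assumes the instance ID from above):

```shell
# Poll a command until its output matches a pattern, e.g.:
#   wait_for 'euca-get-console-output i-43714a2a' 'Secure Shell server'
wait_for () {
    cmd="$1"; pattern="$2"
    until eval "$cmd" | grep -q "$pattern"; do
        sleep 5
    done
}

# Offline demonstration with a command that matches immediately:
wait_for 'echo "Starting OpenBSD Secure Shell server: sshd"' 'Secure Shell server' && echo ready
```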

Retrieve password

This is what you would need to do with Windows machines: parse the password from the console output. We don't do this with Debian.

Allow access -- firewall config

$ euca-authorize default -p 80
default None None tcp 80 80 None
GROUP   default
PERMISSION      default ALLOWS  tcp     80      80

We also need SSH. The IP address below should be yours as visible from the outside world (not an internal address like 192.168.1.2). Amazon's documentation suggests typing "what is my IP number" into your favorite search engine and using any such service. Alternatively, log in to any other remote UNIX machine you have access to and type "who".

$ euca-authorize default -p22 -s your.ip.address/32
default None None tcp 22 22 your.ip.address/32
GROUP   default
PERMISSION      default ALLOWS  tcp     22      22      FROM    CIDR    your.ip.address/32

Now really logging in

From the familiar describe-instances command we can retrieve the full hostname. It is not humanly interpretable

$ euca-describe-instances  | grep i-43714a2a  | cut -f 4
ec2-174-129-161-80.compute-1.amazonaws.com

and I refuse to think about them much, so I hide them

$ ssh -i keypair_private.asc_complete root@`euca-describe-instances  | grep i-43714a2a  | cut -f 4`
The authenticity of host 'ec2-174-129-161-80.compute-1.amazonaws.com (174.129.161.80)' can't be established.
RSA key fingerprint is bd:f7:6f:c2:63:85:c5:b3:0e:4f:a6:66:0f:03:67:d3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-174-129-161-80.compute-1.amazonaws.com,174.129.161.80' (RSA) to the list of known hosts.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0644 for 'keypair_private.asc_complete' are too open.
It is recommended that your private key files are NOT accessible by others.
This private key will be ignored.
bad permissions: ignore key: keypair_private.asc_complete
Permission denied (publickey).

The machine is the right one, as one can verify via the console output shown earlier (which I admittedly had not done, but should have). Correcting the file permissions, quickly:

chmod 600 keypair_private.asc_complete

And repeating the previous command again...

$ ssh -i keypair_private.asc_complete root@`euca-describe-instances  | grep i-43714a2a  | cut -f 4`
Linux domU-12-31-39-00-C5-02 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:39:36 EST 2008 i686

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.

Amazon EC2 Debian testing squeeze AMI built by Eric Hammond
http://alestic.com  http://ec2debian-group.notlong.com

domU-12-31-39-00-C5-02:~#

We are in.

Installing, configuring, and running applications

You can now work with the Debian image as with any regular Debian installation, or rather, as with any such installation that you have virtualised under Xen, VMware, VirtualBox, KVM, or whatever you name. The challenge now is to install the software one wants to work with. Since we are root and have network access, this is rather straightforward.

BOINC

The Berkeley Open Infrastructure for Network Computing (BOINC) shall serve as an example. It lets regular desktop machines do scientific work while they would normally sit idle.

Installation

Routine?

# apt-get install boinc-client
Reading package lists... Done                         
Building dependency tree                              
Reading state information... Done                     
E: Couldn't find package boinc-client                 

This is really a base system, so we need to fetch the package lists first.

# apt-get update              
Get:1 http://security.debian.org squeeze/updates Release.gpg [835B]
...
Get:9 http://http.us.debian.org squeeze/contrib Packages [62.1kB]
Get:10 http://http.us.debian.org squeeze/non-free Packages [109kB]
Fetched 6175kB in 18s (328kB/s)
Reading package lists... Done
domU-12-31-39-00-C5-02:~# apt-get install boinc-client
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  schedtool
Suggested packages:
  boinc-app-seti boinc-manager kboincspy
The following NEW packages will be installed:
  boinc-client schedtool
0 upgraded, 2 newly installed, 0 to remove and 126 not upgraded.
Need to get 467kB of archives.
After this operation, 1033kB of additional disk space will be used.
Do you want to continue [Y/n]?
Get:1 http://http.us.debian.org squeeze/main boinc-client 6.4.5+dfsg-2 [442kB]
Get:2 http://http.us.debian.org squeeze/main schedtool 1.3.0-1 [24.5kB]
Fetched 467kB in 1s (406kB/s)
Preconfiguring packages ...
Selecting previously deselected package boinc-client.
(Reading database ... 16549 files and directories currently installed.)
Unpacking boinc-client (from .../boinc-client_6.4.5+dfsg-2_i386.deb) ...
Selecting previously deselected package schedtool.
Unpacking schedtool (from .../schedtool_1.3.0-1_i386.deb) ...
Processing triggers for man-db ...
Setting up boinc-client (6.4.5+dfsg-2) ...
Starting BOINC core client: boinc.
Setting up scheduling for BOINC core client and children: idle, batch.
Setting up schedtool (1.3.0-1) ...

We could now also install boinc-app-seti as the science application, but since CUDA machines are so much faster than this virtual one, there is little motivation to run it here. We do not want boinc-manager either, since it would drag in all the X window packages: we started not the desktop image (which has such extraneous bits) but the bare-bones base Debian system.

Configuration

Let us go for FightAids@Home from the WorldCommunityGrid.org initiative for a night. We need an account key for BOINC, which is shown on the project's profile page. Every user has a different one, so create an account on that web page if you do not have one yet and log in. One then only needs to call "attach" with that account key:

# boinccmd --project_attach http://www.worldcommunitygrid.org f61bxxxxxxxxxxxx332b

and the BOINC client does the rest. With "top" one can then observe the application taking 45% of the CPU time. An explanation of why it is not 100% would be lovely.

Status check

Now we want to learn about the progress the client is experiencing.

# boinccmd --get_state
======== Projects ========                    
1) -----------                                
   name: World Community Grid                 
   master URL: http://www.worldcommunitygrid.org/
   user_name: me                    
   team_name: Wikipedia                          
   resource share: 100.000000                    
   user_total_credit: 220965.993641              
   user_expavg_credit: 358.576267
...
======== Application versions ========
1) -----------
   application: faah
   version: 6.07
   project: World Community Grid
======== Workunits ========
1) -----------
   name: faah7404_ZINC04967729_xmdEq_1MSN_00
...
======== Results ========
1) -----------
   name: faah7404_ZINC04967729_xmdEq_1MSN_00_0
   WU name: faah7404_ZINC04967729_xmdEq_1MSN_00
   project URL: http://www.worldcommunitygrid.org/
   report deadline: Tue Jul 28 21:44:50 2009
   ready to report: no
...
   estimated CPU time remaining: 45867.073447
   supports graphics: no

Or more quickly, after overcoming some initial overhead ...

# boinccmd --get_state | grep "remaining"
   estimated CPU time remaining: 43626.352060
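The conversion from these seconds to hours is plain shell arithmetic; a one-liner helper (our own name):

```shell
# Integer division, rounded down: 43626 s is a bit over 12 hours.
secs_to_hours () {
    echo $(( $1 / 3600 ))
}

secs_to_hours 43626    # -> 12
```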

Benchmarking

These 43000 seconds are about 12 hours, which is not ultimately impressive; it should be around 6 hours. Curiosity does not stop here, so we look at BOINC's benchmarks:

# boinccmd --get_messages 0 | grep CPU
 1 1247953477 Running CPU benchmarks
 1 1247953477 Suspending computation - running CPU benchmarks
 1 1247953509    Number of CPUs: 1
 1 1247953509    2160 floating point MIPS (Whetstone) per CPU
 1 1247953509    6114 integer MIPS (Dhrystone) per CPU

And those numbers are rather impressive. But since timing is said to be an issue with Xen-based machines, we probably should not be too excited for the moment.

Update: the first workunits took some 10 hours. This is roughly on par with a 1.8GHz 32-bit Intel PC from the year 2001.

Result Name: faah7404_ZINC04967729_xmdEq_1MSN_00_0
Device Name: domU-12-31-39-00-C5-02
Status: Valid
Sent Time: 18.07.09 21:44:50
Time Due / Return Time: 19.07.09 15:14:15
CPU Time (hours): 4.30
Claimed / Granted BOINC Credit: 72.1 / 63.5

The second machine (extra redundancy required by the project) took just as long and was most likely not virtualised. It may simply have been a big task. Experience will tell; a second night confirms the initial measurements.

Shutting down

How to terminate an instance was shown before. We could also just call shutdown as we would on our desktop machine's console.

From instance to image

This section is very early work in progress

On our regular (non-virtualised) desktops, a reboot leaves the machine in more or less the state our instance would be in after closing all applications. And if your application starts automatically and can do checkpointing (like BOINC), then work can continue after the cleaning lady has pulled the plug or your little one has taken an interest in the lamps of your PC. With Amazon, such hard reboots are less likely; the challenge here lies elsewhere.

With Amazon's instances, every change made to the original image is lost upon a restart, and it is not present when a second instance is started: the "restart" truly is the termination of the first instance and the fresh start of a second. The way out is to save a (modified, otherwise there is no point) instance as a new image.

For us, the challenge is to prepare a BOINC AMI without having to reinstall everything from the blank Debian distribution, e.g. in order to play with a faster node than the current instance's, a node wrongly classified as 'small' when it should rather be called 'slow'.

Reserve storage space

Every user of EC2 also has access to Amazon's S3 storage facilities, so there is no extra effort for us. We will eventually need our access keys, since S3 cannot deal with certificates. Since that storage is metered and billed, we now profit from having used the smallest possible base image before: we do not want to pay for X libraries that we do not need. (Somebody please demonstrate that the cloud-run client can indeed be accessed with a locally executed boinc-manager.)

Image preparation: bundling

The term "bundling" seems wrong at first glance. It is named this way because the (presumably large) image is not uploaded in one piece but as many smallish parts, which eases the handling of communication errors. The bundling is performed with the ec2/euca tools themselves, and for a snapshot of a running instance, these must be installed on that instance. Since the euca tools are not yet apt-get-able from within Debian, this tutorial continues with the ec2 tools that the selected Debian image already provides.
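The splitting idea can be illustrated with plain coreutils. This toy example (file names are ours) shows that the parts reassemble losslessly, which is exactly what the bundle/upload/register chain relies on:

```shell
# Create a 100 KiB dummy "image", split it into 40 KiB parts with
# numeric suffixes (image.part.00, .01, .02), reassemble, compare.
dd if=/dev/zero of=image.img bs=1024 count=100 2>/dev/null
split -b 40k -d image.img image.part.
cat image.part.* > reassembled.img
cmp image.img reassembled.img && echo "parts reassemble losslessly"
```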

Install ec2/euca tools

The package to look for is euca2ools.

Transfer of credentials

While bundling, the bundler needs to ensure that only the bundler and Amazon are able to decrypt the image. Consequently, the bundler needs to learn how to encrypt the image, and for that there are technical means (as with GPG keys) by which it would only need a public key. But for some reason (somebody please explain or apologise) we are asked to upload our intimate .pem files with both the certificate and the private key.

This really hurts. We need to trust our image to be inaccessible, or others could start many, many BOINC clients (or malicious applications) in our name and on our purse. And we need to make sure we do not accidentally place the credentials into a folder that becomes part of the new image. They are hence placed under /mnt, a directory spared from the snapshot, on a very different partition:

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.9G  657M  8.8G   7% /
tmpfs                 854M     0  854M   0% /lib/init/rw
udev                  854M   24K  854M   1% /dev
tmpfs                 854M     0  854M   0% /dev/shm
/dev/sda2             147G  188M  140G   1% /mnt

To upload the credentials, perform a scp

$ scp -i keypair_private.asc_complete  $HOME/.ec2/*.pem root@ec2-174-129-161-80.compute-1.amazonaws.com:/mnt/
cert-4U2XXXXXXXXXXXXXXXXXXXXXXXXXXX6TD.pem           100%  916     0.9KB/s   00:00
pk-4U2XXXXXXXXXXXXXXXXXXXXXXXXXXX6TD.pem             100%  926     0.9KB/s   00:00

Invocation of bundling

If we now executed the ec2-bundle-vol command right away as found in the documentation, we would be surprised to find our application data as part of the image. But this application may soon be outdated, and certainly outdated are our client data.

# #don't do this rightaway# ec2-bundle-vol -d /mnt -k /mnt/p*.pem -c /mnt/c*.pem -u 0GFSRLKJISFDKUHG -r i386 -p boincimage

So it seems plausible to exclude our variable data somehow. Others must have had this problem before; the man page tells us to use the "-e" flag to exclude directories. But which directories? Comparing the output of "dpkg -L boinc-client" with the directories actually containing BOINC-related files, we found the folder

domU-12-31-39-00-C5-02:~# du -sh /var/lib/boinc-client
65M     /var/lib/boinc-client/

and in there two dominant subfolders

domU-12-31-39-00-C5-02:~# find !$
find /var/lib/boinc-client/      
/var/lib/boinc-client/           
/var/lib/boinc-client/get_current_version.xml
/var/lib/boinc-client/client_state.xml       
/var/lib/boinc-client/slots                  
/var/lib/boinc-client/slots/0                
/var/lib/boinc-client/slots/0/xmdEq_1MSN.pdbqt
/var/lib/boinc-client/slots/0/boinc_lockfile  
/var/lib/boinc-client/slots/0/receptor.NA.map 
/var/lib/boinc-client/slots/0/ZINC04160907_xmdEq_1MSN_01.gpf
/var/lib/boinc-client/slots/0/wcg_checkpoint_0f.ckp         
/var/lib/boinc-client/slots/0/wcg_checkpoint_0b.ckp         
/var/lib/boinc-client/slots/0/wcg_checkpoint_02.ckp         
/var/lib/boinc-client/slots/0/receptor.HD.map               
/var/lib/boinc-client/slots/0/wcg_checkpoint.dat            
/var/lib/boinc-client/slots/0/wcg_autodock4.dlg             
... mostly task-specific files
/var/lib/boinc-client/projects                                        
... mostly general project files and images for various projects

of which one seems to have most of the data

domU-12-31-39-00-C5-02:~# du -sh /var/lib/boinc-client/*
...
7.1M    /var/lib/boinc-client/projects
...
58M     /var/lib/boinc-client/slots
...

and if the boinc-client is sufficiently robust in its error handling, then we can take an image including the projects folder while leaving the slots folder aside. The hope is that the client will not attempt to reload outdated files. And if it does, or if the application crashes, then this is an Open Source program: we can either investigate ourselves or have folks to talk to. This is how the snapshot is now created (finally):

domU-12-31-39-00-C5-02:~# ec2-bundle-vol -d /mnt -e /var/lib/boinc-client/slots -k /mnt/p*.pem -c /mnt/c*.pem -u 1333-222-1118 -r i386 -p boincimage
Copying / into the image file /mnt/boincimage...
Excluding:
         /selinux
         /sys
         /proc/bus/usb
         /proc
         /dev/pts
         /dev
         /media
         /mnt
         /proc
         /sys
         /etc/udev/rules.d/70-persistent-net.rules
         /etc/udev/rules.d/z25_persistent-net.rules
         /var/lib/boinc-client/slots
         /mnt/boincimage
         /mnt/img-mnt
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.002658 s, 394 MB/s
mke2fs 1.41.3 (12-Oct-2008)
Bundling image file...

 ... 5 minutes wait here ...

Splitting /mnt/boincimage.tar.gz.enc...
Created boincimage.part.00
Created boincimage.part.01
Created boincimage.part.02
...
Created boincimage.part.13
Created boincimage.part.14
Created boincimage.part.15
Generating digests for each part...
Digests generated.
Unable to read instance meta-data for product-codes
Creating bundle manifest...
ec2-bundle-vol complete.

Please note that the exclusions happen for different reasons. /mnt is excluded because it is a mounted directory (the -a flag would circumvent that; take care of your credentials then). The /mnt/{boincimage,img-mnt} files are excluded because the "-d /mnt" flag implicitly defines them as destination or interim files. /var/lib/boinc-client/slots we excluded manually ourselves. The others are excluded on principle.

Interesting was that the user is identified by the account number, not by the access key ID as one reads everywhere. There seems to be some confusion about it:

domU-12-31-39-00-C5-02:~# ec2-bundle-vol -d /mnt -e /var/lib/boinc-client/slots -k /mnt/p*.pem -c /mnt/c*.pem -u 0GFSRLKJISFDKUHG -r i386 -p boincimage
--user has invalid value '0GFSRLKJISFDKUHG': the user ID should consist of 12 digits (optionally hyphenated); this should not be your Access Key ID
Try 'ec2-bundle-vol --help'

The description of euca-bundle-vol has it right. In my installation at least, ec2-bundle-vol has no help available.
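Given this confusion, a quick format check before bundling can help. The helper name is ours, and the sample number 1234-5678-9012 merely has the right shape (it is nobody's real account):

```shell
# The bundling user ID must be 12 digits, optionally hyphenated,
# and must NOT be the access key ID.
valid_user_id () {
    echo "$1" | tr -d '-' | grep -Eq '^[0-9]{12}$'
}

valid_user_id 1234-5678-9012 && echo "looks like an account number"
valid_user_id 0GFSRLKJISFDKUHG || echo "not an account number (access key?)"
```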

Upload image to S3

The first thing we need is a "bucket" (some may prefer calling it a "directory"). I have forgotten how I did it the first time around; this time I am using the Firefox plugin from s3fox.net. I do not consider it an optimal tool (it missed the 17th file when I wanted to delete my previous big folder with the Debian-Med AMI, so the folder could not be deleted afterwards, and entering the access keys and account number was implemented clumsily), but it seems to work and I am thankful for it. Create a new folder, like 'boinc-debian-us', or better, just don't: a typo I produced taught me that the bucket is created for you by the upload tool :)

Then upload the files there

# ec2-upload-bundle -b boinc-debian-us -m /mnt/boincimage.manifest.xml -a $EC2_ACCESS_KEY -s $EC2_SECRET_KEY
Creating bucket...
Uploading bundled image parts to the S3 bucket boinc-debian-us ...
Uploaded boincimage.part.00
Uploaded boincimage.part.01
Uploaded boincimage.part.02
...
Uploaded boincimage.part.13
Uploaded boincimage.part.14
Uploaded boincimage.part.15
Uploading manifest ...
Uploaded manifest.
Bundle upload completed.

Registering the image - so we can refer to it

The registration gives us the AMI ID, and only with that can we start an instance. It is again executed from our local host, though I see no reason why it could not be executed from within the cloud as well. Comments, anyone?

$ euca-register boinc-debian-us/boincimage.manifest.xml
IMAGE   ami-6d39d804

Testing the new image

Now I definitely want to run it, but on a fast machine, and I will keep the old one going to allow for a comparison. For preparing this tutorial, Amazon wants 3 Euros from me so far: no movie this month. The instance type one now has to choose I found after some moments on Google: it is c1.medium, not m1.medium. This certainly has a reason.

$ # don't do this # euca-run-instances --instance-type c1.medium ami-6d39d804
RESERVATION     r-4fe3ab26      166691755018    default
INSTANCE        i-e5516b8c      ami-6d39d804                    pending None    2009-07-19T23:41:13.000Z        aki-a71cf9ce    ari-a51cf9cc

And now logging in is not possible because no keypair was specified: we were just too eager to get the instance up. Scrolling back a bit and forth again, this is what should have been executed:

$ euca-run-instances -k keypair_for_regular_debian_machine --instance-type c1.medium ami-6d39d804
RESERVATION     r-1fe4ac76      166691755018    default
INSTANCE        i-cd576da4      ami-6d39d804                    pending keypair_for_regular_debian_machine      2009-07-19T23:50:20.000Z        aki-a71cf9ce    ari-a51cf9cc
$ euca-terminate-instances i-e5516b8c
INSTANCE        i-e5516b8c

Also, it is not a good idea to execute the 'euca-run-instances ...' command several times in the dire hope of seeing a change in the 'pending' state. Not a good idea indeed, since every execution will cost you another 20 cents. Use "-n 10" if you want 10 instances instead.

$ euca-describe-instances
RESERVATION     r-af0d45c6      166691755018    default
INSTANCE        i-43714a2a      ami-e048af89    ec2-174-129-161-80.compute-1.amazonaws.com      domU-12-31-39-00-C5-02.compute-1.internal       running         keypair_for_regular_debian_machine      0       m1.small        2009-07-18T19:40:54.000Z        us-east-1b      aki-a71cf9ce    ari-a51cf9cc
RESERVATION     r-1fe4ac76      166691755018    default
INSTANCE        i-cd576da4      ami-6d39d804    ec2-174-129-94-64.compute-1.amazonaws.com       domU-12-31-39-00-81-21.compute-1.internal       running         keypair_for_regular_debian_machine      0       c1.medium       2009-07-19T23:50:20.000Z        us-east-1b      aki-a71cf9ce    ari-a51cf9cc

It should not be necessary to add further firewall permissions, since this instance is in the "default" group just like the first one. If you have a dynamic IP address, though, it might nonetheless be reasonable to check that it has not changed.

$ ssh -i keypair_private.asc_complete root@ec2-174-129-94-64.compute-1.amazonaws.com
The authenticity of host 'ec2-174-129-94-64.compute-1.amazonaws.com (174.129.94.64)' can't be established.
RSA key fingerprint is bd:f7:6f:c2:63:85:c5:b3:0e:4f:a6:66:0f:03:67:d3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-174-129-94-64.compute-1.amazonaws.com,174.129.94.64' (RSA) to the list of known hosts.
Linux domU-12-31-39-00-81-21 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:39:36 EST 2008 i686

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.

Amazon EC2 Debian testing squeeze AMI built by Eric Hammond
http://alestic.com  http://ec2debian-group.notlong.com

Last login: Sun Jul 19 21:10:04 2009 from 85.179.237.210
domU-12-31-39-00-81-21:~#

We are in and ....

top - 00:01:36 up 10 min,  1 user,  load average: 1.99, 1.67, 0.87
Tasks:  47 total,   4 running,  43 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy, 81.6%ni, 14.1%id,  0.0%wa,  0.0%hi,  0.0%si,  4.2%st
Mem:   1788724k total,   341048k used,  1447676k free,     3908k buffers
Swap:   917496k total,        0k used,   917496k free,    87544k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  865 boinc     39  19  216m  99m 1044 R   95  5.7   8:20.26 wcg_faah_autodo
  862 boinc     39  19  216m  99m 1060 R   95  5.7   8:46.38 wcg_faah_autodo
    1 root      15   0  2156  724  628 S    0  0.0   0:00.52 init

BOINC is munching along with two processes at 95% CPU time each. This can mean everything and nothing for now; real life will tell, another 24 hours and $4.80 later. The messages tell us that the downloading of new workunits was successful, though it could be done more nicely. The benchmarks are not completely different, maybe even slightly worse, but we get nearly full CPU utilisation and twice as many CPUs:

 1 1248048416    Number of CPUs: 2
 1 1248048416    2263 floating point MIPS (Whetstone) per CPU
 1 1248048416    5817 integer MIPS (Dhrystone) per CPU

Update: indeed we now only need around four to five wall-clock hours for a single workunit:

Result Name: faah7408_ZINC00649952_xmdEq_1MSN_04_0
Device Name: domU-12-31-39-00-81-21
Status: Valid
Sent Time: 7/19/09 23:52:19
Time Due / Return Time: 7/20/09 05:54:54
CPU Time (hours): 3.61
Claimed / Granted BOINC Credit: 59.4 / 56.8

Result Name: faah7408_ZINC00662498_xmdEq_1MSN_02_0
Device Name: domU-12-31-39-00-81-21
Status: Pending Validation
Sent Time: 7/19/09 23:51:59
Time Due / Return Time: 7/20/09 05:54:54
CPU Time (hours): 4.34
Claimed / Granted BOINC Credit: 71.4 / 0.0

The "medium" instances are apparently truly worth their extra money, relatively speaking.

Describing the new AMI

A description would be beneficial for every AMI, but Amazon seems to support this only for AMIs that are to be shared. Look at this url to publish your image. There is probably nothing speaking against publishing the image created here, except that the user needs to be changed after every start.

Outlook

There are a few challenges that one would now be tempted to address.

Someone please feel free to continue...

See also

External links


CategorySoftware | CategoryVirtualization | CategorySystemAdministration