work in progress

#language en

Booting a Linux system over the Network from a Debian server

Work In Progress

This page is still being written. It is not complete. Even the parts that are written may not be fully correct as bugs are discovered during testing...

Preface

Booting an existing system over the network must not be confused with ways of installing a new Debian system, such as DebianNetworkInstall or PXEBootInstall. Both of those approaches result in a standalone machine which can boot without server support. Parts of this page are similar or identical to PXEBootInstall but are repeated here for convenience.

This page describes using Debian as a server to store the already-installed operating system for another machine (which we will call the client).

If you are setting up an internet cafe, then you may like to look at the Linux Terminal Server Project (LTSP), which may already do what you want. If you are looking to do something different, or to learn from the experience, then the following may interest you more.

On the Web, several articles can be found that describe in more or less detail how to set up a network boot server, whether for installation or for regular running. They all have the same weakness: you are required to execute a long list of instructions without getting any feedback until the very end of the procedure, when you try to boot. If it works, fine. If not, debugging becomes very difficult. Therefore in the following we break down the procedure into steps that can be debugged separately.

This also allows you to become familiar with various error messages one at a time, which may support your confidence later when you see a similar error in a production environment. This page does not describe the quickest method, but one that is hoped to lead to the best understanding of the process.

Preconditions

The computer or computers that load their operating systems over the network will be called the client(s).

The computer containing the operating systems will be called the server, and note that it contains both its own OS and that of the client(s).

We assume here that the Server is running Debian Jessie, though in fact these instructions have also been tested more widely as mentioned below.

The client OS need not be Debian, but if it is not you need to check that the kernel supports NFS "out of the box", and that it has a similar mechanism for adding a script to the initrd. If you do not know what that means, stick with Debian Jessie.

How to install NFS support on other distros and other Unices is beyond the scope of this wiki -- try asking for support on the forums or wikis of the other OS.

Debian generic kernels already have this NFS support "out of the box", and as we will see below we can enable it with simple changes made at boot time.

We assume that the Client and the Server are part of a LAN with the following IP addresses, and of course you are free to adapt these to your own needs:

Router: 192.168.0.1
Server: 192.168.0.2
Client: 192.168.0.x

You will find out the value of x later.

Note that many routers also provide a DHCP server: it is best to turn it off, since only one normally configured DHCP server can run in a given LAN. There are two exceptions.

Firstly, you may be able to configure your router's DHCP server to comply with the ISC DHCP server configuration below, in which case you can simply omit the steps installing the ISC DHCP server. Re-configuring your router is outside the scope of this document.

Secondly, it is possible to configure the ISC DHCP server in what is known as "proxy mode". In this mode it will peacefully co-exist with your router, allowing the router to serve the network configs, and then itself adding a "PS" with the boot details. This too is outside the scope of this document, but the settings are documented online - search for "isc-dhcp-server proxy configuration". This is ideal when for whatever reason it is not possible to change the router (the difficulties could be technical, or to do with rules imposed by the owner of the router). However, proxy DHCP service is deprecated wherever it is not absolutely essential, as it does make troubleshooting more difficult.

Finally, you may prefer that the router and the server are the same machine, i.e. that your Debian server is the default gateway for this LAN. This will work fine after amending one or other of the given addresses.

The following instructions have been tested with clients running Debian 8.2 (jessie) in January 2017, and with servers running Debian 8.2, Linux Mint 18.1 Serena, and Linux Mint Debian Edition 2 Betsy. The presence of a GUI running on the server did not noticeably affect performance on a lightly loaded client, nor did the client seem to affect the GUI on the server.

Get the Client OS running, standalone, on the Server

This sounds a bit paradoxical at first, but eventually we are going to serve the client filesystem from the server, so we may as well have the client OS files on the server right from the start.

This approach does entail that (plus or minus a few disk drives) the hardware is reasonably similar. If the hardware differs significantly you will need to prepare the client OS on the client machine and copy the filesystem across, or perhaps use a virtual machine.

In setting up the client OS, please keep the following in mind:

* mount / as ro -- Note that you can set mount options in the Debian installer -- setting ro at install time saves a lot of corrective work later on. We want as much as possible of the client OS to be readonly, both to make NFS more efficient and to reduce the number of possible concurrency issues if we boot more than one client concurrently.

* mount /boot on its own partition and with rw access

* mount /var separately and with rw access

* mount /home separately and with rw access

* If you have enough memory (2 GB or more, and certainly from 4 GB upwards) it is better to run the clients without using any swap at all. If you do need swap, try if possible to keep it local to each client rather than swapping over the network!

* If despite the above, you do choose swap-over-LAN, remember you will need room for separate swap partitions for each client ;)
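
The mount layout listed above can be checked mechanically once the client OS is installed. The following is only a sketch: the function name check_fstab is made up for this page, and it assumes the standard fstab column order (device, mount point, type, options). It reports when / is not mounted ro, and when /boot, /var or /home lack an entry of their own or are mounted ro.

```shell
# check_fstab [FILE] -- hypothetical helper, not part of any Debian package.
# Exits 0 if FILE (default /etc/fstab) follows the layout described above.
check_fstab() {
    awk '
        /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
        { opts[$2] = "," $4 "," }            # remember the options per mount point
        END {
            bad = 0
            if (opts["/"] !~ /,ro,/) { print "/ is not mounted ro"; bad = 1 }
            n = split("/boot /var /home", want, " ")
            for (i = 1; i <= n; i++) {
                m = want[i]
                if (!(m in opts))          { print m " has no entry of its own"; bad = 1 }
                else if (opts[m] ~ /,ro,/) { print m " is mounted ro"; bad = 1 }
            }
            exit bad
        }
    ' "${1:-/etc/fstab}"
}
```

Run it as check_fstab with no arguments on the client system. Silence and a zero exit status mean the file matches the list above. Note that an option such as errors=remount-ro does not count as ro.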

On your first boot into the client system, remount / as rw to make some changes

 # mount -o remount,rw /

Then add lines to /etc/fstab to mount tmpfs over the directories /tmp and /run/tmp. DO NOT do the same with /var/tmp, as the FHS requires these temporary files to persist across reboots.

tmpfs.tmp     /tmp     tmpfs      defaults  0  0
tmpfs.run     /run/tmp tmpfs      defaults  0  0

Some programs ignore the FHS and insist on writing into /etc or other areas that are supposed to be usable with a read only filesystem.

We will assume that these programs will want separate values for each client on which they run. For example, the ntpd daemon (which keeps your computer in step with internet time) wants to write to /etc/adjtime, and these adjustments are specific to one set of hardware and even on the same hardware they may change from time to time. So we set up some space for them under /var, but also allowing us to hold more than one value.

Firstly, by keeping /etc readonly, we will have issues with mtab. The fix is well known:

  # ln -sf /proc/self/mounts /etc/mtab

Set up the following in /var to hold various writable files that we relocate from elsewhere

  # mkdir -p /var/myself

Then for each file that we need to move into writable space, do the following

  # mv /etc/adjtime /var/myself/adjtime
  # ln -s /var/myself/adjtime /etc/adjtime

So we have moved /etc/adjtime from a read only space to a read write one. (The reason we put all these files together under a single, second level directory will become clear later.)
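
Since the same two commands are repeated for every file that needs to move, they can be wrapped in a small function. This is a sketch under assumptions: the name relocate is invented here, the second argument defaults to the /var/myself directory created above, and files are stored under their base name only, so two files with the same name from different directories would collide.

```shell
# relocate FILE [STORE] -- hypothetical helper wrapping the mv + ln -s pair.
# Moves FILE into STORE (default /var/myself) and leaves a symlink behind.
relocate() {
    src=$1
    store=${2:-/var/myself}
    # A symlink means the file was relocated earlier: nothing to do.
    if [ -L "$src" ]; then
        return 0
    fi
    if [ ! -e "$src" ]; then
        echo "relocate: $src does not exist" >&2
        return 1
    fi
    mkdir -p "$store" || return 1
    dest=$store/$(basename "$src")
    mv "$src" "$dest" && ln -s "$dest" "$src"
}
```

With / remounted rw, relocate /etc/adjtime then has the same effect as the two commands shown above.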

Still using the ntp server as an example, we need to do something about its "drift" file. This correctly appears in /var space in the default config, but the problem is that the "drift" will be different for every client. If we do not provide different drift files, a single client will take a long time to settle down, and if multiple clients are run together they will never settle down to good timekeeping.

Edit /etc/ntp.conf, find the line

  driftfile /var/lib/ntp/ntp.drift

and comment it out, adding a new line below it, thus:

  #driftfile /var/lib/ntp/ntp.drift
  driftfile /var/clients/myself/ntp.drift

and then move the current drift file (but not while ntp is running!), creating the target directory first if it does not yet exist:

  # systemctl stop ntp
  # mkdir -p /var/clients/myself
  # mv /var/lib/ntp/ntp.drift /var/clients/myself/ntp.drift
  # systemctl start ntp
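
If you prefer not to edit /etc/ntp.conf by hand, the same change can be scripted. The function below is a sketch with an invented name (set_driftfile). It comments out every active driftfile line, adds a line with the new path below it, and appends one at the end if the file had none. It does not stop or start ntp and does not move the drift file itself.

```shell
# set_driftfile CONF NEWPATH -- hypothetical helper for the edit described above.
set_driftfile() {
    conf=$1
    new=$2
    tmp=$(mktemp) || return 1
    if awk -v new="$new" '
            /^[ \t]*driftfile[ \t]/ {
                print "#" $0              # keep the old line, commented out
                print "driftfile " new    # and add the replacement below it
                seen = 1
                next
            }
            { print }
            END { if (!seen) print "driftfile " new }
        ' "$conf" > "$tmp"
    then
        cat "$tmp" > "$conf"              # overwrite in place, keeping owner and mode
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

Call it with /etc/ntp.conf as the first argument and the new drift file path from the text above as the second. Running it a second time would comment out the line it added and add it again, so run it once only.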

Reboot, leave the / filesystem as readonly, and see what else complains.
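
One way to see what complains is to search the log of the current boot for the kernel's standard error text for a write to a readonly filesystem. The filter below is a sketch: the function name ro_complaints is invented, and it only finds programs that log the error text rather than failing silently.

```shell
# ro_complaints -- hypothetical filter: reads a log on stdin and keeps only
# the lines that mention the "Read-only file system" error.
ro_complaints() {
    grep -i 'read-only file system'
}
```

Typical use would be journalctl -b | ro_complaints, or ro_complaints < /var/log/syslog if you prefer the traditional log.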

Where something is attempting to write to a place under /etc, move it to /var/myself as we did for /etc/adjtime and either link to it from its original place, or update a config file.

Do the same wherever a distinct value needs to be kept for each client computer.

It may take some experimentation to get this totally right. Some guidance may be found in the page ReadonlyRoot, especially the section on Special files in /etc -- but adapt the advice there to move relevant files into /var/clients/myself or into directories within that.

Notice we do not need to do anything to cater for different users: each has their own directory under /home, and the OS will handle this appropriately even over NFS.

Set up the server base system

Going back to the server hardware, we are now going to install the server software.

Note - trying to use the SAME OS install to run the server as you are serving as client introduces odd problems. Even if they are both Debian Jessie, do have them as distinct installations on your server in their respective, different, partitions.

So we make a new installation of Debian Jessie using different partitions than the ones we used for the client software.

The following instructions have been tested with a fresh install of Debian Jessie from the NetInstall ISO. I deselected all Desktop and GUI options, installing only SSH on top of the base install. This worked well, and you may not even need SSH.

At the other extreme, the same instructions have also been tested, successfully, with versions of Linux Mint based directly or indirectly on Debian Jessie and with a Desktop GUI and copious installed software.

The instructions have not been tested outside the Debian "downstream family".

It also seems that a variety of base systems will work, anything from a minimal Jessie install to a full-scale GUI.

PXE boot

So far we have our server machine set up so that it will dual boot into either server or client modes, but have not yet done anything with the client hardware.

It is time to turn to the client machine which I assume currently has no software installed other than the BIOS it came with. It may not even have a hard drive.

Set up the BIOS boot menu of the Client to boot from the network. Exactly how to do this depends on your motherboard. On many machines you can make this selection just for the current boot, or for future boots generally. There will be some kind of key stroke to hit during BIOS POST.

You may like to try any of Esc, F2, F9, F10, F12 as soon as the splash screen appears after power on.

Selecting PXE (network) booting when there is no network boot server typically produces an output that contains the Client's MAC address. Then, it will fail with something like

  PXE-E53: no boot filename received.

Of course not! We didn't do that stage yet!

Note the MAC address, it will be helpful for interpreting log messages later.

On many servers, it is also possible to temporarily switch to PXE boot without permanently changing the BIOS settings.
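
The MAC address noted from the PXE screen is easier to match against server logs if it is written the way the DHCP server log writes it, in lower case with colons. The helper below is a sketch with an invented name (normalize_mac). It accepts the address with spaces, dashes, dots or colons as separators, which is an assumption about how your BIOS might display it.

```shell
# normalize_mac ADDRESS -- hypothetical helper: prints ADDRESS as
# lower-case, colon-separated hex, or fails if it is not 12 hex digits.
normalize_mac() {
    hex=$(printf '%s' "$1" | tr 'A-F' 'a-f' | tr -d ' :.-')
    case $hex in
        *[!0-9a-f]*)
            echo "normalize_mac: not a MAC address: $1" >&2
            return 1
            ;;
    esac
    if [ "${#hex}" -ne 12 ]; then
        echo "normalize_mac: not a MAC address: $1" >&2
        return 1
    fi
    # Insert a colon after every pair of digits, then drop the trailing one.
    printf '%s\n' "$hex" | sed 's/../&:/g; s/:$//'
}
```

Later on, something like grep "$(normalize_mac '40-01-1C-47-44-1E')" /var/log/syslog picks out the log lines that belong to this Client.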

Set up DHCP server

On the Server, we need to set up a DHCP server. Your existing DHCP server needs to be turned off about now -- certainly before you reboot this server below.

Current best practice seems to be to use the package isc-dhcp-server, which provides a daemon dhcpd.

Its configuration file is /etc/dhcp/dhcpd.conf. Modify this file so that it contains roughly the following; adapt IP and MAC addresses to your local needs:

default-lease-time 600;
max-lease-time 7200;

allow booting;

# in this example, we serve DHCP requests from 192.168.0.(3 to 253)
# and we have a router at 192.168.0.1
subnet 192.168.0.0 netmask 255.255.255.0 {
  range 192.168.0.3 192.168.0.253;
  option broadcast-address 192.168.0.255;
  option routers 192.168.0.1;             # our router
  option domain-name-servers 192.168.0.1; # our router, again
  filename "pxelinux.0"; # (this we will provide later)
}

group {
  next-server 192.168.0.2;                # our Server
  host tftpclient {
    filename "pxelinux.0"; # (this we will provide later)
  }
}

After each modification of the above, restart the DHCP server with

  # systemctl restart isc-dhcp-server

Check that it is actually running:

  # systemctl status isc-dhcp-server

which usually gives some recent log output too.

Immediately before rebooting the client, you may like to run

  # journalctl -fu isc-dhcp-server

which shows you the last few lines of the DHCP server log, then updates the screen with each new log entry.

If you have other machines connected to the server, check that they are being served network information. Reboot at least one of them, or unplug a network cable and replug it. You should see them being given a network address on the server screen. More importantly, check that they can still reach the internet. If not, sort this out before going on to the client -- did you type the network details correctly in the dhcpd.conf file?

Reboot the Client machine. On success, it will output the IP addresses of the Server ("DHCP"), of the router ("Gateway") and of itself (192.168.0.x). Then it will hang with a TFTP request, and finally write the error message:

  PXE-E32: TFTP open timeout

and at the same time you will see log messages on the server screen showing the DHCP requests and offers, similar to the sample output shown after the alternative command below.

If you prefer not to use systemd, or wish to compare the traditional log output for diagnostic purposes, you can search /var/log/syslog, for example with this command:

  # grep DHCP /var/log/syslog

Jun  3 09:53:46 server dhcpd: DHCPDISCOVER from 40:01:1c:47:44:1e via eth0
Jun  3 09:53:47 server dhcpd: DHCPOFFER on 192.168.0.3 to 40:01:1c:47:44:1e via eth0
Jun  3 09:53:51 server dhcpd: DHCPREQUEST for 192.168.0.3 (192.168.0.2) from 40:01:1c:47:44:1e via eth0
Jun  3 09:53:51 server dhcpd: DHCPACK on 192.168.0.3 to 40:01:1c:47:44:1e via eth0

(Note that earlier Debian releases used /var/log/daemon.log instead of syslog)
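
The DHCPACK lines also tell you the value of x, that is, the address the Client was actually given. The function below is a sketch with an invented name (leased_ip). It assumes log lines of the shape shown above ("DHCPACK on ADDRESS to MAC") and prints the address from the most recent matching line.

```shell
# leased_ip MAC -- hypothetical helper: reads a DHCP server log on stdin and
# prints the address most recently acknowledged to MAC. Fails if none found.
leased_ip() {
    awk -v mac="$(printf '%s' "$1" | tr 'A-F' 'a-f')" '
        {
            # Look for the word sequence: DHCPACK on <ip> to <mac>
            for (i = 1; i <= NF - 4; i++)
                if ($i == "DHCPACK" && $(i+1) == "on" && $(i+3) == "to" &&
                    tolower($(i+4)) == mac)
                    ip = $(i+2)
        }
        END { if (ip != "") print ip; else exit 1 }
    '
}
```

For the sample output above, leased_ip 40:01:1c:47:44:1e < /var/log/syslog prints 192.168.0.3.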

If nothing appears in the log with either command, check the network links between the Server and the Client. Note that some switches may impose severe limitations on DHCP traffic; for Cisco ones, use 'portfast' if possible (see http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00800b1500.shtml).

Set up TFTP server

Next, we need to set up a TFTP server on the Server.

Again, there are several packages that provide TFTP (trivial FTP, unsafe, to be used in LANs only). It seems to be best practice to use the package tftpd-hpa. (Its lead author is H Peter Anvin, who also gave us Syslinux, so it has good pedigree.)

On installation, a few questions are asked. The responses to these questions go into a configuration file, /etc/default/tftpd-hpa. There should be no need to modify the following default contents:

  TFTP_USERNAME="tftp"
  TFTP_DIRECTORY="/srv/tftp"
  TFTP_ADDRESS="0.0.0.0:69"
  TFTP_OPTIONS="--secure"

Ignore older Web sites that instruct you to insert something like 'RUN_DAEMON="yes"'.

After each modification of the above configuration file, restart the TFTP server with

  # systemctl restart tftpd-hpa

On jessie the directory /srv/tftp will be automatically created. This means the next two steps are not necessary if you use jessie.

Initially, on pre-jessie versions, this might fail with a message like

  Restarting HPA's tftpd: in.tftpd/srv/tftp missing, aborting.

Therefore, as root, create the directory /srv/tftp. Restart the TFTP daemon. Check that it is actually running:

  # systemctl status tftpd-hpa
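
On pre-jessie systems the missing directory can also be created from the configuration file itself, so that the path is typed in only one place. This is a sketch: the function names are invented, and it relies on /etc/default/tftpd-hpa being a list of shell variable assignments, as in the default contents shown above.

```shell
# tftp_root [CONF] -- hypothetical helper: prints TFTP_DIRECTORY from CONF
# (default /etc/default/tftpd-hpa).
tftp_root() {
    # A subshell keeps the sourced variables out of the calling shell.
    ( . "${1:-/etc/default/tftpd-hpa}" && printf '%s\n' "$TFTP_DIRECTORY" )
}

# ensure_tftp_root [CONF] -- creates that directory if it is missing.
ensure_tftp_root() {
    dir=$(tftp_root "$1") || return 1
    if [ -z "$dir" ]; then
        echo "ensure_tftp_root: TFTP_DIRECTORY is not set" >&2
        return 1
    fi
    [ -d "$dir" ] || mkdir -p "$dir"
}
```

Run ensure_tftp_root as root, then restart the daemon as described above.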

It is useful to test your TFTP server with a TFTP client; the tftp-hpa package will do for this purpose:

  # cd /tmp
  # uname -a >/srv/tftp/test
  # tftp 192.168.0.2
  tftp> get test
  tftp> quit
  # diff test /srv/tftp/test
  (nothing, they are identical)
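
The same round trip can be run without typing at the tftp> prompt, which is handy when you repeat the test after each configuration change. This is a sketch under assumptions: the function name tftp_roundtrip is invented, and it uses the -c option of the tftp-hpa client to pass the get command on the command line (check man tftp on your system if in doubt). It must run on the Server, as a user who may write to the TFTP directory.

```shell
# tftp_roundtrip SERVER [ROOT] -- hypothetical helper: places a test file in
# ROOT (default /srv/tftp), fetches it from SERVER by TFTP and compares.
tftp_roundtrip() {
    server=$1
    root=${2:-/srv/tftp}
    name=tftp-roundtrip.$$
    work=$(mktemp -d) || return 1
    if ! uname -a > "$root/$name"; then
        rm -rf "$work"
        return 1
    fi
    # Download into a scratch directory so nothing is left behind in /tmp.
    ( cd "$work" && tftp "$server" -c get "$name" )
    if cmp -s "$work/$name" "$root/$name"; then
        status=0
    else
        status=1
    fi
    rm -rf "$work" "$root/$name"
    return $status
}
```

For example, tftp_roundtrip 192.168.0.2 && echo TFTP OK prints the message only if the downloaded copy is identical to the original.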

It is also useful to see what log entries you get when you download a file that exists, and when you try to download one that doesn't. While testing with the tftp client, try both the old and the new form of the log command and compare what each shows.

Monitor the results of your experiments by following the server log:

  # tail -f /var/log/syslog

(Note: as of January 2017, the corresponding systemd command does not display requests for files that do not exist, which can be crucial for debugging issues around missing files, so for this stage the traditional command given above is preferable.)
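
When the log gets busy, it can help to reduce it to a list of the files that were asked for. The filter below is a sketch with an invented name (tftp_requests). It assumes that read requests are logged in the form "RRQ from ADDRESS filename NAME"; compare this with what your own syslog shows and adjust the pattern if it differs.

```shell
# tftp_requests -- hypothetical filter: reads a log on stdin and prints each
# requested filename with the number of times it was requested.
tftp_requests() {
    awk '
        {
            # Look for the word sequence: RRQ from <address> filename <name>
            for (i = 1; i <= NF - 4; i++)
                if ($i == "RRQ" && $(i+1) == "from" && $(i+3) == "filename")
                    count[$(i+4)]++
        }
        END { for (f in count) print count[f], f }
    ' | sort -k2
}
```

Use it as tftp_requests < /var/log/syslog after one of your download experiments.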

Reboot the Client while the above command is running on the server. You should see error messages on the client screen starting with

  PXE-T01: File not found

which is quite correct since we did not yet provide any relevant files ;)

Notice that it is a different error message than before - the client does at least know a filename to ask for. On the server screen you will see exactly what the client is asking for (RRQ) and the messages back saying that file is not found (NAK).
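
The three client error messages met so far each point at a different stage of the setup. As a memory aid, they can be collected in a small lookup function. This is a sketch with an invented name (pxe_hint), and the hints only restate what this page has described for each stage.

```shell
# pxe_hint CODE -- hypothetical helper: says which stage to look at for the
# PXE error codes discussed on this page.
pxe_hint() {
    case $1 in
        PXE-E53)
            echo "DHCP stage: no boot filename received; check the DHCP server and its filename setting"
            ;;
        PXE-E32)
            echo "TFTP stage: the request timed out; check that the TFTP server is running and reachable"
            ;;
        PXE-T01)
            echo "TFTP stage: the server answered but the file was not found; check the contents of the TFTP directory"
            ;;
        *)
            echo "pxe_hint: no hint for $1" >&2
            return 1
            ;;
    esac
}
```

For example, pxe_hint PXE-E32 points you back to the TFTP server section.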

Install some Syslinux files

Work in progress

Please forgive the part finished page.

I am writing this one section at a time as I repeat the process myself, with the aim of avoiding publishing mis-remembered details.

See Also