Differences between revisions 126 and 127
Revision 126 as of 2022-01-04 00:13:41
Size: 22585
Editor: ?VassilMladenov
Comment: libvirt-clients is a dependency of libvirt-daemon-system, redundant in install command
Revision 127 as of 2022-01-06 11:14:06
Size: 22587
Editor: PaulWise
Comment: modernise the apt installs

Translation(s): English - Español - 한국어 - Norsk - Русский



Introduction

The Kernel-based Virtual Machine, or KVM, is a full virtualization solution for Linux on x86 (64-bit included) and ARM hardware containing virtualization extensions (Intel VT or AMD-V on x86). It consists of a loadable kernel module, kvm.ko, which provides the core virtualization infrastructure, and a processor-specific module, kvm-intel.ko or kvm-amd.ko.
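
On x86, one quick way to check whether the host CPU advertises these extensions is to look for the vmx (Intel VT) or svm (AMD-V) flag in /proc/cpuinfo. The sketch below wraps that check in a small function; the function name and its optional file argument are illustrative assumptions, not part of any Debian tool.

```shell
# Count the logical CPUs whose "flags" line advertises hardware
# virtualization (vmx = Intel VT, svm = AMD-V).
# The optional argument (default /proc/cpuinfo) only exists to make the
# function easy to try on a saved copy of the file.
# Note: grep exits with a non-zero status when the count is 0.
count_virt_cpus() {
    grep -E -c '^flags[[:space:]]*:.*[[:space:]](vmx|svm)([[:space:]]|$)' \
        "${1:-/proc/cpuinfo}"
}

# Usage:
#   count_virt_cpus
```

A result of 0 means no extensions are visible to the kernel; they may also be disabled in the firmware setup.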

In Debian, Xen is an alternative to KVM. (VirtualBox is not in Debian main, is not in Debian Buster, and will not be in Debian Buster-Backports; see bug 794466.)

Installation

It is possible to install only QEMU and KVM for a very minimal setup, but most users will also want libvirt for convenient configuration and management of the virtual machines (libvirt-daemon-system provides libvirt; virt-manager is a GUI for libvirt). Typically a user should install:

$ sudo apt install qemu-system libvirt-daemon-system

When installing on a server, you can add the --no-install-recommends apt option, to prevent the installation of extraneous graphical packages:

$ sudo apt install --no-install-recommends qemu-system libvirt-clients libvirt-daemon-system

The libvirtd daemon will start automatically at boot time and load the appropriate KVM module, kvm-amd or kvm-intel, which is shipped with the Linux kernel Debian package. If you intend to create virtual machines (VMs) from the command line, install virtinst.

In order to manage virtual machines as a regular user, that user needs to be added to the libvirt group:

# adduser <youruser> libvirt

You should then be able to list your domains, that is, the virtual machines managed by libvirt:

# virsh list --all

User-specific and system-wide VMs

By default, if virsh is run as a normal user it will connect to libvirt using the qemu:///session URI. This URI allows virsh to manage only the set of VMs belonging to this particular user. To manage the system set of VMs (i.e., VMs belonging to root), virsh should be run as root or with the qemu:///system URI:

$ virsh --connect qemu:///system list --all

To avoid having to use the --connect flag on every command, the URI string can be set in the LIBVIRT_DEFAULT_URI environment variable:

$ export LIBVIRT_DEFAULT_URI='qemu:///system'

Creating a new guest

The easiest way to create and manage a VM guest is to use a GUI application such as virt-manager.

Alternatively, you can create a VM guest via the command line using virtinst. Below is an example showing the creation of a Buster guest with the name buster-amd64:

virt-install --virt-type kvm --name buster-amd64 \
--cdrom ~/iso/Debian/debian-10.0.0-amd64-netinst.iso \
--os-variant debian10 \
--disk size=10 --memory 1000

Since the guest has no network connection yet, you will need to use the GUI virt-viewer to complete the install.

You can avoid having to download the ISO by using the --location option:

virt-install --virt-type kvm --name buster-amd64 \
--location http://deb.debian.org/debian/dists/buster/main/installer-amd64/ \
--os-variant debian10 \
--disk size=10 --memory 1000

To use a text console for the installation you can tell virt-install to use a serial port instead of the graphical console:

virt-install --virt-type kvm --name buster-amd64 \
--location http://deb.debian.org/debian/dists/buster/main/installer-amd64/ \
--os-variant debian10 \
--disk size=10 --memory 1000 \
--graphics none \
--console pty,target_type=serial \
--extra-args "console=ttyS0"

For a fully automated install look into preseed or debootstrap.

Setting up bridge networking

Between VM guests

By default, QEMU uses macvtap in VEPA mode to provide NAT internet access or bridged access with other guests. This setup allows guests to access the Internet (if there is an internet connection on the host), but will not allow the host or other machines on the host's LAN to see and access the guests.

Between VM host and guests

Libvirt default network

If you use libvirt to manage your VMs, libvirt provides a NATed bridged network named "default" that allows the host to communicate with the guests. This network is available only for the system domains (that is, VMs created by root or using the qemu:///system connection URI). VMs using this network end up in 192.168.122.0/24, and DHCP is provided to them via dnsmasq. This network is not automatically started. To start it use:

 virsh --connect=qemu:///system net-start default

To make the default network start automatically use:

 virsh --connect=qemu:///system net-autostart default

In order for things to work this way you need to have the recommended packages dnsmasq-base, bridge-utils and iptables installed.
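
To see whether the default network is already running before starting it, the "Active:" field printed by virsh net-info can be checked. The following is a minimal sketch; the two function names are made up for illustration, and the parsing assumes the English output of virsh (hence LC_ALL=C).

```shell
# Print "yes" or "no" depending on whether a libvirt network is active.
# Parses the "Active:" line of "virsh net-info".
# The network name defaults to "default".
net_is_active() {
    LC_ALL=C virsh --connect qemu:///system net-info "${1:-default}" |
        awk '$1 == "Active:" { print $2 }'
}

# Start the network only when it is not already running.
ensure_net_started() {
    if [ "$(net_is_active "$1")" != "yes" ]; then
        virsh --connect qemu:///system net-start "${1:-default}"
    fi
}

# Usage:
#   ensure_net_started default
```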

Accessing guests with their hostnames

After the default network is set up, you can configure libvirt's DNS server, dnsmasq, so that you can access the guests using their host names. This is useful when you have multiple guests and want to access them using simple hostnames, like vm1.libvirt, instead of memorizing their IP addresses.

First, configure libvirt's default network. Run virsh --connect=qemu:///system net-edit default and add to the configuration the following line (e.g., after the mac tag):

<domain name='libvirt' localOnly='yes'/>

libvirt is the name of the domain for the guests. You can set it to something else, but make sure not to set it to local, because it may conflict with mDNS. Setting localOnly='yes' is important to make sure that requests to that domain are never forwarded upstream (to avoid request loops).

The resulting network configuration should look something like this:

<network connections='1'>
  <name>default</name>
  <uuid>66b33e64-713f-4323-b406-bc636c054af5</uuid>
  <forward mode='nat'>
    <nat>
      <port start='1024' end='65535'/>
    </nat>
  </forward>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:af:9f:2a'/>
  <domain name='libvirt' localOnly='yes'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>

Now configure the VM guests with their names. For example, if you want to name a guest 'vm1', log in to it and run:

sudo hostnamectl set-hostname vm1.libvirt

Next, configure the host's NetworkManager, so that it uses libvirt's DNS server and correctly resolves the guests' hostnames. First, tell NetworkManager to start its own version of dnsmasq by creating a configuration file /etc/NetworkManager/conf.d/libvirt_dns.conf with the following content:

[main]
dns=dnsmasq

Second, tell the host's dnsmasq that for all DNS requests regarding the libvirt domain the libvirt's dnsmasq instance should be queried. This can be done by creating a configuration file /etc/NetworkManager/dnsmasq.d/libvirt_dns.conf with the following content:

server=/libvirt/192.168.122.1

libvirt here is the domain name you set in the configuration of libvirt's default network. Note, the IP address must correspond to libvirt's default network address. See the ip-tag in the network configuration above.

Now, restart the host's NetworkManager with

sudo systemctl restart NetworkManager

From now on the guests can be accessed using their hostnames, like ssh vm1.libvirt.

[Source]

Manual bridging

To enable communication between the VM host and VM guests, you can set up a macvlan bridge on top of a dummy interface, as shown below. After the configuration, you can select interface dummy0 (macvtap) in bridged mode as the network configuration in the VM guest's configuration.

modprobe dummy
ip link add dummy0 type dummy
ip link add link dummy0 macvlan0 type macvlan mode bridge
ip link set dummy0 up
ip addr add 192.168.1.2/24 broadcast 192.168.1.255 dev macvlan0
ip link set macvlan0 up

Between VM host, guests and the world

To allow communication between the host, the guests and the outside world, you may set up a bridge as described on the QEMU page.

For example, you can modify the network configuration file /etc/network/interfaces to attach the ethernet interface eth0 to a bridge interface br0, as shown below. After the configuration, you can select bridge interface br0 as the network connection in the VM guest configuration.

auto lo
iface lo inet loopback

# The primary network interface
auto eth0

#make sure we don't get addresses on our raw device
iface eth0 inet manual
iface eth0 inet6 manual

#set up bridge and give it a static ip
auto br0
iface br0 inet static
        address 192.168.1.2
        netmask 255.255.255.0
        network 192.168.1.0
        broadcast 192.168.1.255
        gateway 192.168.1.1
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0
        bridge_maxwait 0
        dns-nameservers 8.8.8.8

#allow autoconf for ipv6
iface br0 inet6 auto
        accept_ra 1

Once that is correctly configured, you should be able to use the bridge on new VM deployments with:

virt-install --network bridge=br0 [...]

Managing VMs from the command-line

You can use the virsh(1) command to start and stop virtual machines. VMs can be generated using virtinst. For more details see the libvirt page. Virtual machines can also be controlled using the kvm command in a similar fashion to QEMU. Below are some frequently used commands:

Start a configured VM guest "VMGUEST":

# virsh start VMGUEST

Notify the VM guest "VMGUEST" to shut down gracefully:

# virsh shutdown VMGUEST

Force the VM guest "VMGUEST" to shut down in case it is hung, i.e. the graceful shutdown did not work:

# virsh destroy VMGUEST
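
The two shutdown commands above can be combined: ask the guest to shut down, wait for a while, and force it off only if it is still running. The following is a sketch under stated assumptions: the function name and the 60 second default are illustrative, and the state check relies on virsh domstate printing "shut off" in the C locale.

```shell
# Ask a guest to shut down, wait up to TIMEOUT seconds, then force it off.
# Usage: stop_guest VMGUEST [TIMEOUT]
stop_guest() {
    guest=$1
    timeout=${2:-60}
    virsh shutdown "$guest" || return 1
    while [ "$timeout" -gt 0 ]; do
        # "virsh domstate" prints e.g. "running" or "shut off"
        state=$(LC_ALL=C virsh domstate "$guest")
        [ "$state" = "shut off" ] && return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    # The graceful shutdown did not finish in time: force the guest off.
    virsh destroy "$guest"
}

# Usage (as root, or with LIBVIRT_DEFAULT_URI set):
#   stop_guest VMGUEST 120
```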

Managing VM guests with a GUI

On the other hand, if you want to use a graphical UI to manage the VMs, you can use a GUI for libvirt such as virt-manager.

Automatic guest management on host shutdown/startup

Guest behavior on host shutdown/startup is configured in /etc/default/libvirt-guests.

This file specifies whether guests should be shut down or suspended, whether they should be restarted on host startup, etc.

The first parameter defines where to find running guests. For instance:

# URIs to check for running guests
# example: URIS='default xen:/// vbox+tcp://host/system lxc:///'
URIS=qemu:///system
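
Other settings in the same file control what happens at boot and at shutdown. The fragment below is only a sketch with example values; the authoritative description of each variable is in the comments of the file shipped on your system.

```shell
# Example values only, not recommendations.

# Start the guests that were running when the host went down
ON_BOOT=start
# Seconds to wait between starting guests
START_DELAY=0
# Shut guests down cleanly instead of suspending them
ON_SHUTDOWN=shutdown
# Seconds to wait for a guest to shut down
SHUTDOWN_TIMEOUT=300
```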

Performance Tuning

Below are some options which can improve the performance of VM guests.

CPU

  • Assign each virtual CPU core to a dedicated physical CPU core
    • Edit the VM guest configuration. Assume the VM guest is named "VMGUEST" and has 4 virtual CPU cores
      # virsh edit VMGUEST
    • Add the lines below after the line "<vcpu ..."

      <cputune>
        <vcpupin vcpu='0' cpuset='0'/>
        <vcpupin vcpu='1' cpuset='4'/>
        <vcpupin vcpu='2' cpuset='1'/>
        <vcpupin vcpu='3' cpuset='5'/>
      </cputune>
      where vcpu is the virtual CPU core id and cpuset is the physical CPU core id allocated to it. Adjust the number of vcpupin lines to match the vcpu count, and cpuset to match the actual physical CPU core allocation. In general, the higher half of the physical CPU core ids are the hyperthreading siblings, which cannot provide full core performance but have the benefit of increasing the memory cache hit rate. A general rule of thumb for setting cpuset is:
    • For the first vcpu, assign a cpuset number from the lower half. For example, if the system has 4 cores and 8 threads, valid cpuset values range from 0 to 7, so the lower half is 0 to 3.
    • For the second and every second vcpu after it, assign the corresponding cpuset number from the higher half. For example, if you assigned the first cpuset to 0, then the second cpuset should be set to 4.

      For the third vcpu and above, you may need to determine which physical CPU cores share more memory cache with the first vcpu, as described here, and assign those cpuset numbers to increase the memory cache hit rate.
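
The rule of thumb above can be sketched as a small generator for the vcpupin lines. The function name is hypothetical, and the script assumes that logical CPU n and logical CPU n + CORES are siblings on the same physical core, as in the 4 core, 8 thread example; check lscpu -e or /sys/devices/system/cpu/cpu0/topology/thread_siblings_list on your host before relying on that. It does not attempt the cache-sharing analysis mentioned for the third vcpu and above.

```shell
# Print a <cputune> block following the rule of thumb above.
# Usage: gen_vcpupin VCPUS CORES
#   VCPUS: number of virtual CPUs in the guest
#   CORES: number of physical cores on the host (threads / 2)
# Assumption: logical CPU n and n + CORES share one physical core.
gen_vcpupin() {
    vcpus=$1
    cores=$2
    i=0
    echo "<cputune>"
    while [ "$i" -lt "$vcpus" ]; do
        # Even-numbered vcpus take the next CPU in the lower half
        cpu=$((i / 2))
        # Odd-numbered vcpus take its sibling in the higher half
        [ $((i % 2)) -eq 1 ] && cpu=$((cpu + cores))
        echo "  <vcpupin vcpu='$i' cpuset='$cpu'/>"
        i=$((i + 1))
    done
    echo "</cputune>"
}

# Usage: 4 vcpus on a 4 core, 8 thread host
#   gen_vcpupin 4 4
```

With 4 vcpus and 4 cores this reproduces the cpuset values 0, 4, 1, 5 used in the example configuration.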

Disk I/O

Disk I/O is usually a performance bottleneck. Unlike CPU and RAM, a VM host may not have dedicated storage hardware to allocate to each VM. Worse, disk is the slowest component among them. There are two types of disk bottleneck: throughput and access time. A modern hard disk can provide 100MB/s throughput, which is sufficient for most systems, but it can only provide around 60 transactions per second (tps).

One way to improve disk I/O latency is to use a small but fast Solid State Drive (SSD) as a cache for larger but slower traditional spinning disks. The LVM lvmcache(7) manual page describes how to set this up.

For the VM Host, you can benchmark different disk I/O parameters to get the best tps for your disk. Below is an example of disk tuning and benchmarking using fio:

  • # echo mq-deadline > /sys/block/sda/queue/scheduler
    # echo 1 > /proc/sys/vm/dirty_background_ratio
    # echo 50 > /proc/sys/vm/dirty_ratio
    # echo 500 > /proc/sys/vm/dirty_expire_centisecs
    # /sbin/blockdev --setra 256 /dev/sda
    # fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=/opt/fio.tmp --bs=4k --iodepth=64 --size=8G --readwrite=randrw --rwmixread=75 --runtime=60
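
Before and after changing the scheduler, it can be useful to confirm which one is active. The kernel lists the available schedulers in /sys/block/DEVICE/queue/scheduler and marks the active one with square brackets. The helper below is a sketch; its name and the optional second argument (a file to read instead of the sysfs path) are illustrative assumptions.

```shell
# Print the active I/O scheduler of a block device.
# The scheduler file looks like "[mq-deadline] kyber bfq none";
# the entry in brackets is the one in use.
# Usage: active_scheduler DEVICE [FILE]
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "${2:-/sys/block/$1/queue/scheduler}"
}

# Usage:
#   active_scheduler sda
```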

For Windows VM guests, you may wish to switch between the slow but cross-platform Windows built-in IDE driver and the fast but KVM-specific VirtIO driver. The installation method for Windows VM guests described below is therefore a little complicated, because it installs both drivers and lets you use whichever suits your needs. Under virt-manager:

  • Native driver for Windows VM guests
    • Create new VM guest with below configuration:
      • IDE storage for Windows OS container, assume with filename WINDOWS.qcow2
      • IDE CDROM, attach Windows OS ISO to CDROM
    • Start VM guest and install the Windows OS as usual
    • Shutdown VM guest
    • Reconfigure VM guest with below configuration:
      • Add a dummy VirtIO / VirtIO SCSI storage with 100MB size, e.g. DUMMY.qcow2
      • Attach VirtIO driver CD ISO to the IDE CDROM

    • Restart VM guest
    • Install the VirtIO driver from the IDE CDROM when Windows prompts for a new hardware driver
    • For VM guests running Windows 10 and above
      • Run "cmd" as Administrator and run the command below
        > bcdedit /set {current} safeboot minimal
    • Shutdown VM guest
    • Reconfigure VM guest with below configuration:
      • Remove IDE storage for Windows OS, DO NOT delete WINDOWS.qcow2
      • Remove VirtIO storage for dummy storage, you can delete DUMMY.qcow2
      • Remove IDE storage for CD ROM
      • Add a new VirtIO / VirtIO SCSI storage and attach WINDOWS.qcow2 to it
    • Restart the VM guest
    • For VM guests running Windows 10 and above
      • Log in to safe mode on the Windows 10 VM guest and run the command below
        > bcdedit /deletevalue {current} safeboot
      • Restart the VM guest
  • Native driver for Linux VM guests
    • Select VirtIO / VirtIO SCSI storage for the storage containers
    • Restart the VM guest
  • VirtIO / VirtIO SCSI storage
    • VirtIO SCSI storage provides richer features than VirtIO storage when multiple storage devices are attached to the VM guest. The performance is the same if only a single storage device is attached.
  • Disk Cache
    • Select "None" for disk cache mode, "Native" for IO mode, and "Unmap" for both Discard mode and Detect zeroes method.
  • Dedicated I/O Threads
    • Specifying an I/O thread can significantly reduce blocking during disk I/O. One I/O thread is sufficient for most cases:
    • Edit the VM guest configuration, assume the VM guest name is "VMGUEST"
      # virsh edit VMGUEST
    • After the first line "<domain ...>", add "iothreads" line:

        <iothreads>1</iothreads>
    • After the line of disk controller, for example, for Virtio-SCSI controller, after the line "<controller type='scsi' ...>", add "driver" line:

        <driver iothread='1'/>

Network I/O

Using virt-manager:

  • Native driver for Windows VM guests
    • Select VirtIO for the network adapter
    • Attach VirtIO driver CD ISO to the IDE CDROM

    • Restart the VM guest. When Windows finds the new network adapter hardware, install the VirtIO driver from the IDE CDROM
  • Native driver for Linux VM guests
    • Select VirtIO for the network adapter
    • Restart the VM guest

Memory

  • Huge Page Memory support
    • Calculate the number of huge pages required. Each huge page is 2MB in size, so we can use the formula below for the calculation.
      Huge Page Counts = Total VM Guest Memory In MB / 2
      e.g. for 4 VM guests, each using 1024MB, huge page counts = 4 x 1024 / 2 = 2048. Note that the system may hang if the memory reserved exceeds the memory available on the system.
    • Configure HugePages memory support using the commands below. Since huge pages might not be allocated if memory is too fragmented, it is better to append the commands to /etc/rc.local

      echo 2048 > /proc/sys/vm/nr_hugepages
      mkdir -p /mnt/hugetlbfs
      mount -t hugetlbfs hugetlbfs /mnt/hugetlbfs
      mkdir -p /mnt/hugetlbfs/libvirt/bin
      systemctl restart libvirtd
    • Reboot the system to enable huge page memory support. Verify huge page memory support with the command below.
      # cat /proc/meminfo | grep HugePages_
      HugePages_Total:    2048
      HugePages_Free:     2048
      HugePages_Rsvd:        0
      HugePages_Surp:        0
    • Edit the VM guest configuration. Assume the VM guest is named "VMGUEST"
      # virsh edit VMGUEST
    • Add the lines below after the line "<currentMemory ..."

      <memoryBacking>
        <hugepages/>
      </memoryBacking>
    • Start the VM guest "VMGUEST" and verify that it is using huge page memory with the commands below.
      # virsh start VMGUEST
      # cat /proc/meminfo | grep HugePages_
      HugePages_Total:    2048
      HugePages_Free:     1536
      HugePages_Rsvd:        0
      HugePages_Surp:        0
      The drop in HugePages_Free from 2048 to 1536 shows that the guest is using 512 huge pages, i.e. 512 x 2MB = 1024MB of memory.
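
The huge page formula used in this section can be wrapped in a small helper. This is a sketch that assumes the 2MB huge page size stated above (grep Hugepagesize /proc/meminfo shows the size on your system); the function name is made up for illustration, and it rounds up so that an odd number of megabytes still fits.

```shell
# Number of 2MB huge pages needed for a given amount of guest memory.
# Usage: hugepage_count TOTAL_GUEST_MEMORY_IN_MB
hugepage_count() {
    # Integer division, rounded up
    echo $(( ($1 + 1) / 2 ))
}

# Example from the text: 4 VM guests with 1024MB each
hugepage_count $((4 * 1024))    # prints 2048
```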

Migrating guests to a Debian host

Migrating guests from RHEL/CentOS 5.x

There are a few minor things in the guest XML configuration files (/etc/libvirt/qemu/*.xml) that you need to modify:

  • Machine variable in <os> section should say pc, not rhel5.4.0 or similar

  • Emulator entry should point to /usr/bin/kvm, not /usr/libexec/qemu-kvm

In other words, the relevant sections should look something like this:

  <os>
    <type arch='x86_64' machine='pc'>hvm</type>

  --- snip ---

  <devices>
    <emulator>/usr/bin/kvm</emulator>

If you had configured a bridge network on the CentOS host, please refer to this wiki article on how to make it work on Debian.
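
The two XML changes listed above (machine type and emulator path) can also be applied with sed. The sketch below is an illustration only: the function name is hypothetical, the patterns match just the rhel machine types and the emulator path quoted in this section, and it is meant to be run on a copy of the guest definition, which can then be loaded with virsh define.

```shell
# Rewrite a RHEL/CentOS 5.x guest definition for a Debian host.
# A backup of the original is kept as FILE.orig.
# Usage: fix_rhel_guest_xml FILE
fix_rhel_guest_xml() {
    sed -i.orig \
        -e "s|machine='rhel[^']*'|machine='pc'|" \
        -e "s|<emulator>/usr/libexec/qemu-kvm</emulator>|<emulator>/usr/bin/kvm</emulator>|" \
        "$1"
}

# Usage:
#   fix_rhel_guest_xml guest.xml
#   virsh define guest.xml
```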

Troubleshooting

No network bridge available

virt-manager uses a virtual network for its guests. By default this is routed to 192.168.122.0/24, and you should see this route by typing ip route as root.

If this route is not present in the kernel routing table then the guests will fail to connect and you will not be able to complete a guest creation.
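
The presence of the route can also be checked from a script. The helper below is a minimal sketch with a made-up name; it reads ip route output on standard input and assumes the default 192.168.122.0/24 subnet mentioned above.

```shell
# Succeeds when the "ip route" output on stdin contains the
# libvirt default subnet.
# Usage: ip route | has_libvirt_route
has_libvirt_route() {
    grep -q '^192\.168\.122\.0/24 '
}

# Usage:
#   ip route | has_libvirt_route || echo "default network route missing"
```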

Fixing this is simple: open virt-manager and go to the "Edit" -> "Host details" -> "Virtual networks" tab. From there you may create a virtual network of your own or attempt to fix the default one. Usually the problem is that the default network is not started.

cannot create bridge 'virbr0': File exists:

To solve this problem you may remove the virbr0 by running:

brctl delbr virbr0

Open virt-manager, go to "Edit" -> "Host details" -> "Virtual networks" and start the default network.

You can check the network status with:

virsh net-list --all

Optionally, you can use a bridged network; see BridgeNetworkConnections.

Windows guests frequently hang or BSOD

Some Windows guests on certain high-end N-way CPUs may hang or BSOD frequently. This is a known kernel bug which unfortunately is not fixed in Jessie (TBC in Stretch). The workaround below can be applied by adding a <hyperv>...</hyperv> section to the guest configuration via the command virsh edit GUESTNAME:

<domain ...>
  ...
  <features>
    ...
    <hyperv>
      <relaxed state='on'/>
    </hyperv>
  </features>
  ...

See also

You can find an example for testing. You can't do it remotely.

External links

Please, add links to external documentation. This is not a place for links to non-free commercial products.


CategorySystemAdministration | CategoryVirtualization | CategorySoftware