My virtual test machine setup

As part of my work I have to keep and maintain a set of virtual machines I use for testing. Testing often requires that you have a clean setup of a certain operating system which is up-to-date to run some tests on it. After testing you would like to return to a defined clean state and get rid of all the changes that you made while troubleshooting a certain package that went rough and touched literally every single configuration file in /etc and whatnot.

This is a problem that probably every tester encounters at some point: How to keep and update a clean set of throwaway virtual machines?

# First (manual) setup

On my work laptop I currently keep the following virtual machines for testing:

$ virsh list --all
 Id   Name                State
------------------------------------
 -    jump                shut off
 -    leap15_1            shut off
 -    leap15_2            shut off
 -    openqa              shut off
 -    sle15sp1            shut off
 -    sle15sp1-aarch64    shut off
 -    sle15sp2            shut off
 -    sles15sp1-ppc64le   shut off
 -    tumbleweed          shut off

Ignore the aarch64 and ppc64le machines, they are special and are not covered by automation. It would in principle also work for them.

I installed every one of them manually and took a fresh snapshot immediately after installation (and registration for SUSE Linux Enterprise products). Then I updated the systems every now and then and made a snapshot after every update.

This process works pretty well. Before I start a test I restore the previous snapshot (e.g. updated-20200508) and I have my sandbox which allows me to do whatever I want. After testing I just restore the snapshot and all is good.
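For reference, the manual routine comes down to two virsh calls. The following is only a sketch under some assumptions: the function names and the VIRSH override variable are made up for illustration, and the snapshot name follows the updated-YYYYMMDD pattern mentioned above.

```shell
#!/bin/bash
# VIRSH can be overridden (e.g. VIRSH=echo) for a dry run. This variable is
# an addition for this sketch, not something virsh itself knows about.
VIRSH="${VIRSH:-virsh}"

# Take a snapshot named updated-YYYYMMDD of the given domain.
take_snapshot() {
    "$VIRSH" snapshot-create-as --domain "$1" --name "updated-$(date +%Y%m%d)"
}

# Revert the given domain to the named snapshot.
restore_snapshot() {
    "$VIRSH" snapshot-revert "$1" "$2"
}
```

With these defined, `restore_snapshot leap15_2 updated-20200508` before a test and the same call afterwards gives the clean sandbox described above.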

But keeping those machines up-to-date was always a bit unsettling. I did this manually for months, until I realized that the overhead of manual updates justified spending some time on automating the process.

# Automating the process

In particular, I wanted the following:

  • Restore the machine to its previous snapshot, in case I forgot to do that
  • Update one machine after another, as I don’t have enough RAM to boot all at the same time
  • Install all updates, do a reboot and check if the machine boots (smoketest)
  • Only if this simple smoketest succeeds, take a new snapshot
  • Delete old snapshots, keep the last 3
  • Only modify the automated snapshots (identified by the prefix updated-), leave any manual snapshots untouched
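The retention rule from this list can be sketched as a pure filter over snapshot names. This is a minimal sketch, not the script itself: the function name snapshots_to_delete is made up for illustration, and it assumes that snapshot names sort chronologically (as with an ISO date suffix) and that GNU head is available for the negative line count.

```shell
#!/bin/bash
# Hypothetical helper: read snapshot names (one per line) on stdin and print
# the automated ones that should be deleted, keeping the newest $keep.
# Assumes names sort chronologically, e.g. updated-2020-05-08.
snapshots_to_delete() {
    local prefix="$1" keep="$2"
    # Only names starting with "<prefix>-" are considered, so manual
    # snapshots never match and are never printed.
    # "head -n -N" (GNU coreutils) prints everything except the last N lines.
    grep "^${prefix}-" | sort | head -n "-${keep}"
}
```

It could be fed from `virsh snapshot-list --name --domain leap15_2 | snapshots_to_delete updated 3`, with each printed name then passed to virsh snapshot-delete.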

To achieve this, I wrote two little scripts: update_vm.sh, which performs the update on a single virtual machine, and update_all_vms, which performs the updates on one machine after another.

The result is amazing: During lunchtime I just fire up update_all_vms and while I’m away my machine fully automatically updates all of my virtual test machines for me. I don’t have to spend any more time on the monkey job of manually going through a bunch of machines. This is awesome and it saves me a lot of time.

And it allows me to scale to even more virtual test machines, if I have to. The maintenance overhead for those machines, at least for the human being, does not increase, and this is what makes the effort pay off in the end.

I’ve been using these scripts for a couple of weeks now, and I must say I’m super happy with them. Wouldn’t go back to doing this manually ;-)

This script runs as a cron job during lunchtime, so I never have to bother about updating my test machines. Running machines will be skipped, so I don’t lose the current state if I’m working on something right now.
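Skipping running machines boils down to a name lookup in the output of virsh list --name. Here is a minimal sketch, with is_listed as a made-up helper name. Matching whole lines matters, because a plain substring match for sle15sp1 would also hit sle15sp1-aarch64.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given name appears as a whole line on stdin.
# -x matches whole lines only, -F treats the name as a literal string, -q is quiet.
is_listed() {
    grep -qxF -- "$1"
}

# Intended use (requires libvirt, so not executed here):
#   if virsh list --name | is_listed "leap15_2"; then
#       echo "leap15_2 is running, skipping"
#   fi
```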

Example workflow in a terminal showing how two virtual machines get updates with the script

# Pre-flight configuration

Before the script will work, please ensure the following requirements are met:

  • all virtual machines have hostnames, which are in /etc/hosts
  • passwordless root ssh access to the virtual machines

We rely on a resolvable hostname, as the script needs to connect to the machine. This means in particular that the virtual machine name must be the same as the hostname, and the hostname needs to be resolvable via /etc/hosts.
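A quick way to verify the first requirement is to look the machine names up in the hosts file. The helper below is a sketch under assumptions: hosts_has_entry is a made-up name, and it only inspects a hosts-format file rather than asking the system resolver.

```shell
#!/bin/bash
# Hypothetical helper: succeed if NAME appears as a hostname or alias in a
# hosts-format file (default: /etc/hosts). Text after '#' is ignored.
hosts_has_entry() {
    local name="$1" file="${2:-/etc/hosts}"
    awk -v name="$name" '
        { sub(/#.*/, "") }                                        # strip comments
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }   # field 1 is the IP address
        END { exit !found }
    ' "$file"
}
```

For example, `for vm in leap15_2 tumbleweed; do hosts_has_entry "$vm" || echo "$vm is missing from /etc/hosts"; done`. The second requirement can be checked by hand with `ssh -o BatchMode=yes root@leap15_2 true`, which fails instead of prompting if key-based login is not set up.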

# TL;DR

  • I automated the process of updating my virtual test machines
  • The scripts update the virtual machines one after another to preserve RAM
  • After installing the updates, a reboot and a simple smoketest are done
  • The script creates a new snapshot for the virtual machine, but only if the update succeeds
  • Old snapshots will be deleted, the last 3 are kept

You will find the scripts below (see section “Bash scripts”). Adjust the list of machines in update_all_vms to your needs; then it is as simple as

update_all_vms

In order to work, all virtual machines need to have hostnames which are configured via /etc/hosts (or equivalent) and passwordless root ssh access needs to be possible.

I run this as a cron job during my lunch break :-)

# Bash scripts

Check out the files in my utils GitHub repo. The cron job is described below.

# update_vm.sh - [github_repo]

The following script updates a single VM. Usage: update_vm.sh VIRTUALMACHINE [UPDATECOMMAND]. Examples for various distributions:

./update_vm "leap_15_2" "zypper ref && zypper patch -y"
./update_vm "tumbleweed" "zypper ref && zypper dup -y"

./update_vm "ubuntu_bionic" "apt-get update && apt-get upgrade -y"
./update_vm "debian_buster" "apt-get update && apt-get upgrade -y"

The second argument, UPDATECOMMAND, is optional and defines the command that is run to update the virtual machine. The default is zypper ref && zypper patch -y, which suits openSUSE Leap and SLE. If you want to update Tumbleweed or Debian/Ubuntu/CentOS/whatever, you need to pass the matching update command here.
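If you maintain several distributions, the choice of update command can be kept in one place. This is a minimal sketch: update_cmd_for and the family names are made up for illustration, and the commands are simply the ones from the examples above.

```shell
#!/bin/bash
# Hypothetical helper: map a distribution family to the update command
# used in the examples above. The family names are made up for this sketch.
update_cmd_for() {
    case "$1" in
        leap|sle)      echo "zypper ref && zypper patch -y" ;;
        tumbleweed)    echo "zypper ref && zypper dup -y" ;;
        debian|ubuntu) echo "apt-get update && apt-get upgrade -y" ;;
        *)             echo "unknown distribution family: $1" >&2; return 1 ;;
    esac
}
```

A call would then look like `./update_vm "tumbleweed" "$(update_cmd_for tumbleweed)"`.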

~/bin/update_vm.sh - download from github

#!/bin/bash
# Summary: Bash script to update a virtual machine and create a new snapshot for the VM
# Steps:
#   * Restore last snapshot for machine
#   * Boot the machine
#   * Install updates
#   * Reboot & smoke test
#   * Shutdown the machine
#   * Create new snapshot
#   * Delete old snapshots, keep only the last 3

# Snapshot prefix
SN_PREFIX="updated"

# Number of snapshots to keep
SNAPSHOTS=3
TIMEOUT=300

# get the last snapshot of the given domain
function last_snapshot {
    domain="$1"
    # anchor the prefix so manual snapshots that merely contain it are ignored
    virsh snapshot-list --name --domain "$domain" | grep "^${SN_PREFIX}-" | sort | tail -n 1
}

# wait until the given machine is booted (i.e. when ssh becomes available)
function boot_await {
    domain="$1"
    for i in `seq 1 $TIMEOUT`; do
	    if ping -c 1 "$domain" >/dev/null; then
		    # ssh also needs to be up
		    if nmap -Pn -p ssh "$domain" | grep ssh | grep open >/dev/null; then return 0; fi
	    fi
	    sleep 1
    done
    return 1
}

# wait until the given machine is shut down
function await_shutdown {
    domain="$1"
    for i in `seq 1 $TIMEOUT`; do
	    # -x/-F: match the whole name literally, so "sle15sp1" does not match "sle15sp1-aarch64"
	    if ! virsh list --name | grep -qxF "$domain"; then return 0; fi
	    sleep 1
    done
    return 1
}

# remove old snapshots of the given machine. Keep $SNAPSHOTS snapshots
function remove_old_snapshots {
    domain="$1"
    # anchor the prefix so only automated snapshots are ever deleted
    snapshots=$(virsh snapshot-list "$domain" --name | grep "^${SN_PREFIX}-" | sort | head -n -$SNAPSHOTS)
    for snapshot in $snapshots
    do
	    virsh snapshot-delete --domain "$domain" --snapshotname "$snapshot"
    done
}

# Cleanup routine at the end of the script
# Meant as cleanup in error cases
function cleanup() {
    set +e
    # ensure VM is powered off
    if virsh list --name | grep -qxF "$domain"; then virsh destroy "$domain" || true; fi
}



#### Main script routine ######################################################

if [[ $# -lt 1 ]]; then
    echo "Usage: $0 DOMAIN [UPDATECOMMAND]"
    exit 1
fi

set -e
domain="$1"
if [[ $# -gt 1 ]]; then
    update_cmd="$2"
else
    update_cmd="zypper ref && zypper patch -y"
fi
## Check if running
if virsh list --name | grep -qxF "$domain"; then
    echo "Cowardly refusing to update $domain (vm currently in use). Please stop the machine first"
    exit 1
fi
trap cleanup EXIT

## Restore last snapshot
snapshot="$(last_snapshot "$domain")"
if [[ $snapshot != "" ]]; then
    # Check if the last snapshot is of today
    if [[ "$snapshot" == "$SN_PREFIX-`date --iso`" ]]; then
	    echo "Cowardly refusing to overwrite an already updated snapshot"
	    echo "If you still need to update the snapshot, remove the old one manually:"
	    echo ""
	    echo "    virsh snapshot-delete --domain \"$domain\" --snapshotname \"$snapshot\""
	    echo ""
	    echo "Error: snapshot $snapshot already exists"
	    exit 1
    fi
    echo "Restoring snapshot '$snapshot' ... "
    virsh snapshot-revert "$domain" "$snapshot"
else
    echo "No previous snapshot for restoring found."
fi
## Start machine, install updates and reboot the machine
# after reboot, we also perform a small smoke test to ensure the update didn't go horribly wrong
echo "Booting $domain ... "
virsh start "$domain" >/dev/null
boot_await "$domain"
echo "Installing updates ... "
ssh "root@$domain" "$update_cmd"
echo "Rebooting machine ... "
ssh "root@$domain" reboot >/dev/null || true
boot_await "$domain"
sleep 10		# give the VM some time
# Smoke tests
echo "Running smoke tests on machine ... "
ssh "root@$domain" uname -a >/dev/null
ssh "root@$domain" shutdown -h now >/dev/null || true 
## Shut the machine down and create new snapshot
await_shutdown "$domain"
sleep 2		# just to be sure
echo "Creating snapshot ... "
virsh snapshot-create-as --domain "$domain" --name "${SN_PREFIX}-`date --iso`"
## Delete old snapshots
echo "Delete old snapshots (keeping $SNAPSHOTS) ... "
remove_old_snapshots "$domain"
snapshot="`last_snapshot $domain`"
echo "Done. Latest snapshot: $snapshot"

# update_all_vms

This script allows me to update all of my virtual machines one after another. It is tailor-made for my virtual machines and is meant as a template that you can adapt to your needs.

~/bin/update_all_vms

#!/bin/bash
# Bash script to update all of my virtual machines

echo "Updating the following virtual machines: "
echo "  Debian 10 Buster"
echo "  openSUSE Leap 15.2"
echo "  openSUSE Tumbleweed"

echo "Hit return to continue"
read

# For cron job the absolute path is required
~/bin/update_vm "test_debian10" "apt-get update && apt-get upgrade -y"
~/bin/update_vm "test_leap15_2" "zypper ref && zypper patch -y"
~/bin/update_vm "test_tumbleweed" "zypper ref && zypper dup -y"

# Cron job

This section describes my cron job setup. It is based on the previous scripts but requires some small adjustments.

# crontab -e
15 12 * * * /root/bin/update_all_vms

IMPORTANT: In order for the cron job to work, you need to have a (passwordless) root ssh key or a procedure with your ssh-agent. I chose the former, as everything is contained on my single machine and the key is not used elsewhere.

# /root/bin/update_all_vms

This bash script is similar to the previous update_all_vms, except you need to have the absolute path, and export LIBVIRT_DEFAULT_URI, to tell virsh where to connect to.

#!/bin/bash
# Bash script to update all of my virtual machines

export LIBVIRT_DEFAULT_URI=qemu:///system

BIN="/root/bin/update_vm"

VMS="leap15_1 leap15_2 sle15sp1 sle15sp2"


for vm in $VMS; do
        $BIN "$vm" "zypper ref && zypper patch -y"
done
# Tumbleweed needs zypper dup
$BIN "tumbleweed" "zypper ref && zypper dup -y"

I’ve been using this setup on my work laptop for some time now, and it’s just super nice to always have a fresh, ready-to-use virtual machine for testing on hand. This is IMHO a must-have for every developer and sysadmin - you always need a test system to try out new tools or to investigate a bug. Keep your main machine clean by doing the dirty work in throwaway virtual machines.

And: Have a lot of fun!


[Update 30.11.2020]

  • Corrected some minor typos. Thanks, Chris, for your input and corrections
  • Replaced zypper up with zypper patch on Leap and SLE - thanks, Martin, for pointing this out

[Update 15.12.2020]

I’ve also added the Cron job section