Scripted FreeBSD reimaging
Recently at work I’ve been building a system to automatically revert FreeBSD systems to a known good state: something like a VM snapshot, but for physical machines too. The system is reasonably fast, portable across different hardware configurations and very easy to use. It has worked quite well and, although reasonably simple, is the result of much research and time, so I’m posting some pointers here.
The Problem
The push for this came from a need to improve the testing environments our developers have been using. The old testing environments were VMs, more or less handmade at some point in the distant past to provide a rough approximation of our live environment. Since that time the developers had mostly looked after them themselves, tweaking, fixing and forking the VMs as they saw fit. New developers would copy an existing developer’s VM, and branch off from there.
This system was bad. There were problems ranging from minor version differences all the way up to entirely missing subsystems, not to mention the total lack of documentation. Every developer changed their VM slightly differently, so troubleshooting problems was as much exploration and discovery as debugging. Worse, the slowly growing number of environments made keeping them in sync (or something approximating that) with the live environment increasingly difficult. Something had to be done.
The Solution
What I’ve ended up going with is a very basic system that builds on as much of our previous infrastructure as possible. The key component is our ‘live environment image’, an OS image that we base all our live servers on. The new development environments are also based on this image, and the solution consists of a set of scripts that automate the conversion of the live environment image into a development environment. So that’s the high-level concept. How do the nitty-gritty details work?
The system is made up of two separate installs of FreeBSD on one disk. The first install is a 5 GB disk slice, containing a minimal install of FreeBSD. This is the Reimaging OS. The second install is on another slice taking up the rest of the disk and is the actual Development Environment.
The Reimaging OS
This copy of FreeBSD isn’t just a fresh install. There are a few changes; the most important are the installation of bash and these two lines added to the machine’s /etc/rc.local file, which start the magic:
scp -i /root/key reimageuser@fileserver:dev-environment-reimage.sh /tmp/
/usr/local/bin/bash /tmp/dev-environment-reimage.sh
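A slightly more defensive variant of those two lines might refuse to run anything if the download failed, so a network hiccup leaves the machine sitting in the Reimaging OS rather than running a stale or partial script. This is only a sketch: the fetch_and_run function and the KEY and BASH_BIN variables are names made up for illustration, not part of our real rc.local.

```shell
# Hypothetical helper for /etc/rc.local: fetch the reimage script and run it
# only if the copy succeeded. KEY and BASH_BIN are illustrative names.
KEY="${KEY:-/root/key}"
BASH_BIN="${BASH_BIN:-/usr/local/bin/bash}"

fetch_and_run() {
    src=$1     # remote location, e.g. reimageuser@fileserver:dev-environment-reimage.sh
    dest=$2    # local path, e.g. /tmp/dev-environment-reimage.sh
    if scp -i "$KEY" "$src" "$dest"; then
        "$BASH_BIN" "$dest"
    else
        echo "could not fetch $src; staying in the Reimaging OS" >&2
        return 1
    fi
}
```

In rc.local this would be followed by a single call such as fetch_and_run reimageuser@fileserver:dev-environment-reimage.sh /tmp/dev-environment-reimage.sh. Either way, the end result is that dev-environment-reimage.sh runs at boot.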
And what does this script do? Well, in short:
- Format the partitions making up the Development Environment.
- Restore the base ‘live environment image’ into the Development Environment.
- Modify the fresh Development Environment to actually be suitable for use as a Development Environment, rather than a live server.
- Change the default boot slice to the second slice, and reboot the machine.
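As a rough illustration, the four steps above might be sketched like this. It is not the real dev-environment-reimage.sh: the disk name (ad0), the single ad0s2a partition, the image path, the customise_dev_environment placeholder and the use of boot0cfg(8) (which assumes the boot0 boot manager is installed) are all assumptions. It defaults to a dry run that only prints the commands it would execute.

```shell
# Sketch of the four reimaging steps. Every name and path below is a
# hypothetical placeholder, not a value from our real configuration.
DISK="${DISK:-ad0}"                            # assumed disk
DEV_SLICE="${DEV_SLICE:-2}"                    # slice holding the Development Environment
IMAGE="${IMAGE:-/tmp/live-environment.dump}"   # assumed dump(8) image of the live environment
TARGET="${TARGET:-/mnt}"                       # temporary mount point
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print commands, 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Placeholder for the site-specific "live server -> dev environment" changes.
customise_dev_environment() {
    :
}

reimage() {
    part="/dev/${DISK}s${DEV_SLICE}a"   # assumes a single root partition

    # 1. Format the Development Environment partition.
    run newfs "$part"

    # 2. Restore the live environment image into it.
    run mount "$part" "$TARGET"
    run sh -c "cd '$TARGET' && restore -rf '$IMAGE'"

    # 3. Turn the restored live image into a development environment.
    run customise_dev_environment "$TARGET"
    run umount "$TARGET"

    # 4. Make the second slice the default boot slice and reboot.
    run boot0cfg -s "$DEV_SLICE" "$DISK"
    run reboot
}

reimage
```

Setting DRY_RUN=0 would run the commands for real, which destroys everything on the target partition, so it is worth reading the dry-run output first.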
So now the machine reboots, and when the bootloader starts up, it loads…
The Development Environment
Remember step three above? The hand-wavey “turn the live environment into a dev. environment” step? Well, you can’t fully complete that step from the Reimaging OS. There are some things you just don’t know. What hostname should you set for the machine? What email address should all outgoing mail be redirected to? It turns out that the first time a dev. environment boots, the developer has to answer some questions. After developers reimage their machine, the final step is to run a script on first login: dev-environment-rechristen.sh. This script goes through and makes all the changes to the system that require user input, plus a few workplace-specific changes.
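To give a flavour of what such a first-login script involves, here is a minimal sketch that asks the two questions mentioned above and records the hostname in rc.conf. The function names (set_rc_conf, ask, rechristen) and the RC_CONF variable are invented for this example, and the mail redirection is left as a stub because how that is done depends entirely on your mail setup.

```shell
# Hypothetical sketch of a rechristen script. RC_CONF can be pointed at a
# scratch file to try this out without touching the real /etc/rc.conf.
RC_CONF="${RC_CONF:-/etc/rc.conf}"

# Replace (or append) key="value" in an rc.conf-style file.
set_rc_conf() {
    key=$1
    value=$2
    file=${3:-$RC_CONF}
    tmp="${file}.tmp.$$"
    [ -f "$file" ] || : > "$file"
    # Drop any existing assignment for this key, then append the new one.
    grep -v "^${key}=" "$file" > "$tmp" || true
    printf '%s="%s"\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Prompt on stderr and print the answer on stdout.
ask() {
    printf '%s: ' "$1" >&2
    read -r answer
    echo "$answer"
}

rechristen() {
    new_hostname=$(ask "Hostname for this machine")
    mail_target=$(ask "Address to redirect outgoing mail to")

    set_rc_conf hostname "$new_hostname"

    # Stub: the real change depends on the site's mail configuration.
    echo "TODO: redirect outgoing mail to $mail_target" >&2
}
```

Calling rechristen at the end of the file (or from a login hook) would run the prompts; the new hostname takes effect on the next boot.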
Of course, the developer also needs to be able to kick off a reimage of their environment at some point in the future. How do they do that? Another script sets the default boot OS to the Reimaging OS and reboots the machine. Developer goes away for a coffee and comes back to a fresh machine.
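A sketch of that kick-off script could look like the following. It assumes the boot0 boot manager (so that boot0cfg(8) can change the default slice), a disk called ad0 and the Reimaging OS on slice 1; the confirmation prompt is an extra safety check added for this example, and the script only prints commands unless DRY_RUN=0 is set.

```shell
# Hypothetical sketch: switch the default boot slice back to the Reimaging OS
# and reboot. Disk name and slice number are assumptions.
DISK="${DISK:-ad0}"
REIMAGE_SLICE="${REIMAGE_SLICE:-1}"
DRY_RUN="${DRY_RUN:-1}"          # 1 = print commands, 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

begin_reimage() {
    printf 'This will wipe the Development Environment. Type "yes" to continue: ' >&2
    read -r reply
    if [ "$reply" != "yes" ]; then
        echo "Aborted." >&2
        return 1
    fi
    run boot0cfg -s "$REIMAGE_SLICE" "$DISK"
    run reboot
}
```

Running begin_reimage and answering yes prints (or, with DRY_RUN=0, executes) the boot0cfg and reboot commands; any other answer leaves the machine untouched.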
The Scripts
So that’s how it works. How about some sample code? Please note that these are edited versions of the scripts I run, with anything even vaguely revealing about our configuration/infrastructure stripped out. They’re a starting point for you, but you still need to do a fair amount of work to get these scripts working for you.
- reimage-setup.txt. This is a short script I wrote to help quickly deploy the Reimaging OS to new machines. Put it on a thumbdrive with a FreeBSD dump(8) image, then boot off a FreeBSD install CD and select Fixit -> Live CD Filesystem. Mount the USB key at /mnt (something like mount /dev/da0s1 /mnt) and then run the script like so: /mnt/reimage-setup.sh /dev/ad0 (where /dev/ad0 is the drive you want to use for the dev. environment). The script (or something like it) has worked well for me across a number of heterogeneous systems but, of course, YMMV. For the record, the rest of these scripts are written in bash, so make sure that’s available in your OSes.
- reimage-do.txt. This script is the basic skeleton of our dev-environment-reimage.sh script. It shares many of the assumptions the previous script makes: what the partitions are, where they are located and so on. I’ve excised the more site-specific customisation, but it’s ready to go otherwise.
- reimage-begin.txt. This script completes the trio. It’s what you run from within the development environment to kick off the reimaging process. Nice and simple, and again shares all the assumptions the previous two scripts made.
- Bonus: a starting point for your reimage-rechristen.sh script. It covers a couple of the things ours does. Most of the rest of our script does site-specific work, like checking some files out of an SVN repository, changing the hostname in a few config files, and so on.
4 Responses to “Scripted FreeBSD reimaging”
I don’t see why the scripts require bash o_O
Also, once it’s ready (and on a sufficiently capable system), using zfs for this kind of task would be better. Maybe.
Interesting traditional approach — I’ll plug my personal approach to a similar problem: radmind (http://sourceforge.net/projects/radmind/).
We use this at my site to manage deployment and patching of the production environment, and when a dev environment is needed we install a skeletonized OS (FreeBSD + the radmind package), point it at the radmind server and let it deploy as it would in production.
When developers make changes we can then run the radmind fsdiff tool and determine exactly what needs to be updated in production. If they reach a point where their system is unusable they can re-apply the production loadset and get back to an exact copy of our production environment.
Something from your procedure above that I’m missing is the “rechristen” script to eliminate the need to edit rc.conf and friends (which I intend to steal now).
—
My notes on deploying radmind along with some example scripts for the interested: http://www.bsd-box.net/~mikeg/blog/index.php?serendipity%5Baction%5D=search&serendipity%5BsearchTerm%5D=radmind&serendipity%5BsearchButton%5D=%3E
edogawaconan: I used bash because the ‘cost’ (i.e., time taken installing the package) is minor and I much prefer the dialect. zfs will only become suitable when (ha!) our production environment moves to zfs as well; it would be far too big a difference before then.
Michael: interesting tool, thanks for the link! I had a quick flick through your writeup and will poke the tool more later. As I understand it, it’s sort of an OS-aware rsync-like tool, so there’s no support for running arbitrary commands on the client as part of the deploy process?