Machine Virtualization As a Development Tool: Part 1

Prologue

The use of Virtual Machines for web development is becoming ubiquitous, and with good reason.

This is the first of a two-article series about using machine virtualization as a development tool. It focuses on the question of what. The second article is found here and focuses on the question of how.

What are Virtual Machines for a software developer?

A Virtual Machine (VM) is a software-based abstraction of a physical machine. This abstraction makes it possible to manage and control access to the actual hardware. Hardware that does not physically exist can even be emulated.

Here is how I recall the explosion in use of Virtual Machines…

When I first became a professional software developer I worked for IBM. The first VM I used for development was VM/CMS. It was a remote, centrally managed VM, and a sensible abstraction away from big iron mainframes with big consequences if something went wrong.

It was not too long until the PC revolution changed the way applications were used, written and delivered. The 3270 terminal (or terminal emulator) was largely replaced with the web browser and Client-Server applications. It was around that time that I displayed a picture on my office wall of a proud man standing in a giant warehouse stacked with PC servers. His arms were spread so one could get a feel for the scale: a tiny man standing in a giant warehouse with an incalculable number of servers. The picture was from a book where the man described his successful journey to replace a mainframe with PC servers. This was IBM, so I am sure the message I should have gained from the book was something like “Wow, look at all the hardware that people would buy, and all the services they would need, to support their desire to join this PC revolution!”

There is something to be said for the decentralization of application development accelerating innovation, but what I saw in the picture was “Maintenance nightmare!”

If I recall correctly, this PC revolution, the move from centrally managed mainframes to cheap distributed PC infrastructure and its requisite services, was quite profitable for IBM. There must be some irony there.

It was a while before PC architecture servers with enough surplus power to support virtualization became cost effective. Until that time the big boys (think IBM, HP, Sun, and many others) continued to offer expensive but powerful machines with hardware-based virtualization (type 1 hypervisor), while the PC guys began to learn the true costs of maintaining so many physical machines.

What do you do if you want the benefits of virtualization but your hardware (architecture) does not meet the Popek and Goldberg virtualization requirements? Well, you write a software hypervisor (type 2 hypervisor) like VMware, and change the world. This kind of VM software made it possible to emulate a number of PC servers on a larger host, with room to scale up as needs grew. Surplus power could then be traded for better maintainability, reliability, predictability and separation.

All that brings us to today, where VMs are used throughout the lifecycle of application development and deployment, especially for web applications, where PC hardware is still dominant.

There are a variety of good VM solutions, but for the sake of simplicity I will focus the rest of this article on one, VirtualBox. VirtualBox is currently free and can be found here.

What can it do for me?

OK, that is all well and good for system administrators and infrastructure planners, but what can it do for me as a developer?

Reduce maintenance time

Changing a network card or adding RAM to a virtual machine is as simple as changing a setting. This is true whether the host is a remotely maintained server, or your laptop. Virtual Machines are portable, so if the physical hardware is acting up, the VM can be moved to another host until the original is repaired or replaced. One could even continue to work on a clone while software is being upgraded on an original VM. Need a development server and all existing servers are at capacity? No problem, simply fire up a new VM.
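To make "as simple as changing a setting" concrete, here is a minimal sketch that builds a call to VBoxManage, the command line tool that ships with VirtualBox. The VM name dev-web, the helper names and the chosen values are assumptions made up for illustration. By default the script only prints the command, so it is safe to run on a machine without VirtualBox.

```python
import subprocess


def modifyvm_command(vm_name, memory_mb=None, nic1=None):
    """Build a VBoxManage command that changes settings of a powered-off VM."""
    cmd = ["VBoxManage", "modifyvm", vm_name]
    if memory_mb is not None:
        cmd += ["--memory", str(memory_mb)]  # RAM size in megabytes
    if nic1 is not None:
        cmd += ["--nic1", nic1]  # attachment mode of the first adapter, e.g. "nat"
    return cmd


def run(cmd, dry_run=True):
    """Print the command, or execute it when dry_run is False."""
    if dry_run:
        print(" ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "dev-web" is a hypothetical VM name.
    run(modifyvm_command("dev-web", memory_mb=2048, nic1="nat"))
```

Passing dry_run=False would hand the same argument list to VirtualBox. The physical equivalent involves a screwdriver and a trip to the server room.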

It is quicker and easier to update a VM in concert with a closely matching production machine than it is to update a number of physical test machines, developer laptops and workstations. This is not simply a matter of scale (updating a single image rather than many physical boxes) but also a matter of compatibility. Compatibility matters because it lets one apply the same changes to production and test VMs with tools such as Puppet, and get predictable results.

Reduced maintenance and provisioning time increases developer productivity, even if the developer never deals with the maintenance or provisioning personally.

Improve testing

Several characteristics of VMs are useful for testing. VM images are software, so unlike their physical counterparts they consume resources only when they are needed. This makes it cheap to keep many test images. Because VMs are portable and can be copied and cloned, it is simple to make them available to whoever needs them, even to many people simultaneously.
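As a rough illustration of handing the same test image to several people, the sketch below builds one VBoxManage clonevm command per tester. The base VM name and the tester names are hypothetical, and the script only prints the commands rather than running them.

```python
def clone_command(base_vm, clone_name):
    """Build a VBoxManage command that makes a full, registered clone."""
    return ["VBoxManage", "clonevm", base_vm, "--name", clone_name, "--register"]


def clone_commands_for(base_vm, testers):
    """One clone per tester, named after the base VM and the tester."""
    return [clone_command(base_vm, f"{base_vm}-{tester}") for tester in testers]


if __name__ == "__main__":
    # "test-base", "alice" and "bob" are made-up names for illustration.
    for cmd in clone_commands_for("test-base", ["alice", "bob"]):
        print(" ".join(cmd))
```

Each tester then works on a private copy, so one person's experiment cannot disturb another's.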

Undo changes with roll-back

VMs make it quick and safe to test dramatic environmental changes. This is because changes can be completely undone with a simple roll-back.

I recall a third-party software upgrade that left the upgraded server completely unusable. Performing an uninstall would still have left the server damaged. A roll-back restored the server to exactly the state it was in prior to the upgrade. I cannot imagine the nightmare had this server not been a VM.
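The pattern in that story (snapshot first, upgrade, roll back if it goes wrong) can be sketched with VBoxManage snapshot commands. This is an illustration under stated assumptions, not a description of what was actually run on that server: the VM name, the snapshot name and the upgrade function are hypothetical, and the sketch assumes the VM must be powered off before a snapshot is restored. By default the commands are printed rather than executed.

```python
import subprocess

VBOX = "VBoxManage"


def take_snapshot_command(vm, name):
    return [VBOX, "snapshot", vm, "take", name]


def rollback_commands(vm, name):
    """Power off, restore the snapshot, then boot the VM again."""
    return [
        [VBOX, "controlvm", vm, "poweroff"],
        [VBOX, "snapshot", vm, "restore", name],
        [VBOX, "startvm", vm, "--type", "headless"],
    ]


def upgrade_with_rollback(vm, name, upgrade, run=print):
    """Snapshot, try the upgrade, roll back on failure.

    Returns True if the upgrade succeeded, False if it was rolled back.
    """
    run(take_snapshot_command(vm, name))
    try:
        upgrade()
        return True
    except Exception:
        for cmd in rollback_commands(vm, name):
            run(cmd)
        return False


def execute(cmd):
    """Pass as run=execute to really invoke VirtualBox."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    def broken_upgrade():
        # Stand-in for a third-party installer that fails.
        raise RuntimeError("upgrade failed")

    upgrade_with_rollback("prod-copy", "pre-upgrade", broken_upgrade)
```

The snapshot is taken before anything is touched. That is the whole trick: the restore point exists whether or not the installer cleans up after itself.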

Better match

Testing is more efficient when all factors, other than those being tested, are consistent and thus eliminated from consideration. This way when differences are observed they can be expected to be the result of the factors being tested. Nobody wants to experience a problem in production that could not be found in test, but when these environments are different, and the differences are not the factors being tested, such problems can happen.

It is easier to create a VM that closely matches production (which is also likely a VM) than it is to configure one’s laptop, workstation, or even a physical test machine to do the same. This is in part because a laptop or workstation is designed to meet different needs than, for example, a dedicated web server.

An example might be that one’s development workstation runs Arch Linux and one’s hosting provider, or a company production environment, uses CentOS 6. To address the discrepancy, one could create a CentOS 6 base VM that closely matches production. In some cases such a VM can be derived from the actual VM used in production.

At this point I should mention that many developers use an IDE (Integrated Development Environment) both for writing code and for local testing. I avoid IDEs. To me, they divert one’s attention from learning the language and development tools to learning the IDE. I find that I get better results by writing a script or an editor macro, or by re-examining what I am doing, than I do from using an IDE shortcut.

I worked with a developer who experienced a really frustrating and intractable code problem. We found that the problem was not with his code but was simply a flaw in the way his IDE executed it. The problem did not, and could not, exist anywhere else.

I bring this up because I find that running code, especially tests, in an IDE exacerbates the problems of run-time differences between local development and production environments. If you are hell-bent on using an IDE, I suspect you could configure it to use a local VM for testing, etc. After all, development tools should conform to the needs of the developer, not the other way around.

Make better use of your local machine

Many developers are familiar with using VMs as part of a remote, company development server pool, but few use VMs locally. VM solutions work well on local machines too. Independent contractors have been taking advantage of virtual machines for years. Visit a new client and create a fresh VM. That VM can then be customized to the client and saved for the next time the client calls. Meanwhile, the host machine is kept safe from constant alteration. This also works well when one encounters a client running a different OS. Have a Mac or Linux laptop and the customer runs Windows or FreeBSD? No problem, just use a VM of whatever is needed.

The second article in this series will focus on how to use machine virtualization as a development tool.