VirtualBox extensions for MAAS

During the last season’s holidays, I spent some time cleaning up demos and code that I use for my daily activities at Canonical. It’s nothing really sophisticated, but for me, and I suspect for some others too, a small set of scripts makes a big difference.

In my daily job, I like to show live demos and I need to install a large set of machines, scale workloads, monitor and administer servers and data centres. Many people I meet don’t want to know only the theoretical details, they want to see the software in action, but as you can imagine, the process of spinning up 8 or 10 machines and install and run a full version of OpenStack in 10-15 minutes, while you also explain how the tools work and perhaps you even try to give suggestions on how to implement a specific solution, is not something you can handle easily without help. Yet, that is what CTOs and Chief Architects want to know in order to decide whether a technology is good or not for them.

At Canonical, workloads are orchestrated, provisioned, monitored and administered using MAAS, Juju and Landscape, around Ubuntu Cloud, which is the Canonical OpenStack offering. These are the products that can do the magic of what I described, but providing in minutes something that usually takes days to install, set up and run.

In addition to this long preface, I am an enthusiastic Mac user. I do love and prefer Ubuntu software and I am not entirely happy with many technical decisions around OS X, but I also found Mac laptops to be a fantastic hardware that simply fits my needs. Unfortunately, the KVM porting to OS X is not available yet, hence the easiest and most stable way to spin up Linux VMs in OS X is to use VMWare Fusion, Parallels or VirtualBox. Coming from Sun/Oracle and willing to use open source software as much as I can, VirtualBox is my favourite and natural choice.

Now, if you mix all the technologies mentioned above, you end up with a specific need: the integration of VirtualBox hosts, specifically running on OS X (but not only), with Ubuntu Server running MAAS. The current version of MAAS (1.5 GA in the Ubuntu archives and 1.7 RC in the maintainers branch), supports virsh for power management (i.e. you can use MAAS to power up, power check and power down your physical and virtual machines), but the VirtualBox integration with virsh is limited to socket communication, i.e. you cannot connect to a remote VirtualBox host, or in other words MAAS and VirtualBox must run in the same OS environment.

Connections to local and remote VirtualBox hosts
Connections to local and remote VirtualBox hosts

 

My first instinct was to solve the core issue, i.e. add support to remote VirtualBox hosts, but I simply did not have enough bandwidth to embark on such an adventure, and becoming accustomed to the virsh libraries would have taken a significant amount of time. So I opted for a smaller, quicker and dirtier approach: to emulate the most simple power management features in MAAS using scripts that would interact with VirtualBox.

MAAS – Metal As A Service, the open source product available from Canonical to provision bare metal and VMs in data centres, relies on the use of templates for power management. The templates cover all the hardware certified by Canonical and the majority of the hardware and virtualised solutions available today, but unfortunately they do not specifically cover VirtualBox. For my workaround, I modified the most basic power template provided for the Wake-On-LAN option. The template simply manages the power up of a VM, and leaves the power check and power down to other software components.

The scripts I have prepared are available on my GitHub account, and are licensed under GPL v2, so you are absolutely free to download it, study it, use it and, even more important, provide suggestions and ideas to improve them.

The README file in GitHub is quite extensive, so I am not going to replicate here what has been written already, but I am going to give a wider architectural overview, so you may better consider whether it makes sense to use the patches or not.

MAAS, VirtualBox and OS X

The testing scenario that I have prepared and used includes OS X (I am still on Mavericks as some of the software I need does not work well on Yosemite), VirtualBox and MAAS. What I need for my tests and demos is shown in the picture below. I can use one or more machines connected together, so I can distribute workloads on multiple physical machines. The use of a single machine makes things simpler, but of course it puts a big limitation to the scalability of the tests and demos.

A simplified testbed with MAAS set as VM that can control other VMs, all within a single OS X VirtualBox Host machine
A simplified testbed with MAAS set as VM that can control other VMs, all within a single OS X VirtualBox Host machine

The basic testbed I need to use is formed by a set of VMs prepared to be controlled by MAAS. The VMs are visible in this screenshot of the VirtualBox console.

VirtualBox Console
VirtualBox Console

Two aspects are extremely important here. First, the VMs must be connected using a network that allows direct communication between the MAAS node and the VMs. This can be achieved locally by using a host-only adapter where MAAS provides DNS and DHCP services and each VM has the Allow All option set in the communication mode combo.

VirtualBox Network

Secondly, VMs must have PXE boot set on. In VirtualBox, this is achievable by selecting the network boot option as the first option available in the system tab.

VirtualBox Boot Options

 

In this way, the VMs can start the very first time and can PXE Boot using a cloud image provided by MAAS. Once MAAS has the VM enlisted as a node, administrators can edit the node via the WEB UI, the CLI app or the RESTful API. Apart from changing the name, what is really important is the setting of the Power mode and the physical zone. The power mode must be set as Wake-On-LAN and the MAC address is the last part of the VM id in VirtualBox (with colons). The Physical zone must be associated to the VirtualBox Host machine.

MAAS Edit NodeMAAS Edit Node

In the picture above the Physical zone is set as izMBP13. The description of the Physical zone must contain the UID and the hostname or IP address of the host machine.

Physical Zone

Once the node has been set properly, it can be commissioned by simply clicking the Commission node button in the Node page. If the VM starts and loads the cloud image, then MAAS has been set correctly.

The MAAS instance interacts with the VirtualBox host via SSH and with responds to PXE Boot requests from the VMs
The MAAS instance interacts with the VirtualBox host via SSH and with responds to PXE Boot requests from the VMs

A quick look at the workflow

The workflow used to connect MAAS and VM is relatively simple. It is based on the components listed below.

A. MAAS control

Although I have already prepared scripts to Power Check and Power Off the VM, at the moment MAAS can only control the Power On. Power On is executed by many actions, such as Commission node or the explicit Start node in MAAS. You can always check the result of this action by checking the event log in the Node page.

09 MAAS Node

B. Power template

The Power On action is handled through a template, which in the case of Wake-On-LAN and of the patched version for VirtualBox is a shell script.

The small fragment of code used by the template is listed here and it is part of the file /etc/maas/templates/power/ether_wake.template:

...
if [ "${power_change}" != 'on' ]
then
...
elif [ -x ${home_dir}/VBox_extensions/power_on ]
then
    ${home_dir}/VBox_extensions/power_on \
    $mac_address
...
fi
...

C. MAAS script

The script ${home_dir}/VBox_extensions/power_on is called by the template. This is the fragment of code used to modify the MAC address and to execute a script on the VirtualBox Host machine:

...
vbox_host_credentials=${zone_description//\"}

# Check if there is the @ sign, typical of ssh
# user@address
if [[ ${vbox_host_credentials} == *"@"* ]]
then
  # Create the command string
  command_to_execute="ssh \
    ${vbox_host_credentials} \
    '~/VBox_host_extensions/startvm \
     ${vm_to_start}'"
  # Execute the command string
  eval "${command_to_execute}"
...

D. VirtualBox host script

The script in ~/VBox_host_extensions/startvm is called by the MAAS script and executes the stratvm command locally:

...
start_this_vm=`vboxmanage list vms \
| grep "${1}" \
| sort \
| head -1`
start_this_vm=${start_this_vm#*\{}
start_this_vm=${start_this_vm%\}*}
VBoxManage startvm ${start_this_vm} \
           --type headless
...

The final result will be a set of VMs that are then ready to be used for example by Juju to deploy Ubuntu OpenStack, as you can see in the image below.

MAAS Nodes (Ready)

 

Next Steps

I am not sure when I will have time to review the scripts, but they certainly have a lot of space for improvement. First of all, by adopting a richer power management option, MAAS will not only power on the VMs, but also power off and check their status. Another improvement regards the physical zones: right now, the scripts loop through all the available VirtualBox hosts. Finally, it would be ideal to use the standard virsh library to interact with VirtualBox. I can’t promise when, but I am going to look into it at some point this new year.

It does not matter if Aurora performs 1x or 10x MySQL: it _is_ a big thing

I spent the last 4 years at SkySQL/MariaDB working on versions of MySQL that could be “suitable for the cloud”. I strongly believed that the world needed a version of MySQL that could work in the cloud even better than its comparable version on bare metal. Users and administrators wanted to benefit from the use of cloud infrastructures and at the same time they wanted to achieve the same performance and overall stability of their installations on bare metal. Unfortunately, ACID-compliant databases in the cloud suffer from the issues that any centrally controlled and strictly persistent system can get when hosted on highly distributed and natively stateless infrastructures.

In this post I am not going to talk about the improvements needed for MySQL in the cloud – I will tackle this topic in a future post. Today I’d like to focus on the business side of RDS and Aurora.

In the last 4 years I had endless discussions over the use of Galera running in AWS on standard EC2 instances. I tried to explain many times that having Galera in such environment was almost pointless, since administrators did not have real control of the infrastructure. The reasons have nothing to do with the quality and the features of Galera, rather with the use of a technology placed in the wrong layer of the *aaS stack. Last but not least, I tried many times to guide the IT managers through the jungle of hidden costs of an installation of Galera (and other clustering technologies) in EC2, working through VPCs, guaranteed IOPs, dedicated instances and AZs etc.

I had interesting meetings with customers and prospects to help them in the analysis of the ROI of a migration and the TCO of an IT service in a public cloud. One example in particular, a media company in North America, was extremely interesting. The head of IT decided that a web service had to be migrated to AWS. The service had predicable web access peaks, mainly during public events – a perfect fit for AWS. When an event is approaching, Sysadms can launch more instances, then they can close them when the event ends. Unfortunately, the same approach cannot be applied to database servers, as their systems require to keep data available at all times. Each new event requires more block storage with higher IOPs and the size and flavour of the DB instances becomes so high spec that the overall cost of running everything in EC2 is higher than the original installation.

Aurora from an end customer perspective

Why is Aurora a big thing? Here are some points to consider:

1. No hidden costs in public clouds

The examples of Galera and the DB servers in AWS that I mentioned, are only two of the surprises that IT managers will find in their bills. There is a very good reason why public clouds have (or should have) a DBaaS offering: databases should be part of IaaS. They must make the most out of the knowledge of the bare metal layer, in terms of physical location, computing and storage performance, redundancy and reliability etc. Cloud users must use the database confidently, leaving typical administration tasks such as data backups and replication to automated systems that are part of the infrastructure. Furthermore, end customers want to work with databases that do not suffer resource contention in terms of processing, storage and network – or at least not in a way that is perceivable from an application standpoint. As we select EBS disks with requested IOPs, we must be able to use a database server with requested QPSs – whatever we define as “Query”. The same should happen for private clouds, since technologies, benefits and disadvantages are substantially the same. In AWS, RDS has already these features, but Aurora simply promises a better experience with more performance and reliability. Sadly, not many alternatives are available from other cloud providers.

2. Reduce the churn rate

A consequence of the real or expected hidden costs is a relatively high churn rate that affects many IT projects in AWS. DevOps start with AWS because it is simple and available immediately, but as the the business grows, so does the bill, and sometimes the growth is not proportional. Amazon needed to remove the increase in costs for their database as one of the reasons to leave or reduce the use of a public cloud, and Aurora is a significant step in this direction. I expect end customers to be more keen to keep their applications on AWS in the long run.

A strong message to the MySQL Ecosystem

There are lots of presentations and analysis around MySQL and the MySQL flavours, yet none of these analysis looks at the generated revenues from the right perspective. Between 2005 and 2010, MySQL was a hot technology that many considered as a serious alternative to closed source relational databases. In 2014, with an amazing combination of factors:

  • A vast number of options available as open source technologies in the database market
  • A substantial change in the IT infrastructure, focused on virtualisation and cloud operations
  • A substantial change in the development of applications and in the types of applications, now dominated by a DevOps approach
  • A fracture in the MySQL ecosystem, caused by forks and branches that generated competition but also confusion in the market
  • An increasing demand for databases focused on rich media and big data
  • A relatively stable and consolidated MySQL
  • A good level of knowledge and skills available in the market
    (…and the list goes on…)

All these factors have not only limited the growth in revenues in the MySQL ecosystem, but have basically shrunk them – if you do not consider the revenues coming from DBaaS. Here is a pure speculation: Oracle gets a good chunk of their revenues for MySQL from OEMs (i.e. commercial licenses) and from existing not-only-MySQL Oracle customers. Although Percona works hard in producing a more differentiated software product (and kudos for the work that the Percona software team does in terms of tooling and integration), the company adopted a healthy, but clearly services-focused business model. The MariaDB approach is more similar to Oracle, but without commercial licenses and without a multi-billion$ customers base. Yet, when you review the now 18 months’ old keynote from 451 research at Percona Live, you realise that the focus on “Who uses MySQL?” is pretty irrelevant: MySQL is ubiquitous and will be the most used open source database in the upcoming years. The question we should ask is rather, “Who pays for MySQL?”, or even better, “Why should one pay for MySQL?”: a reasonable fee paid for MySQL is the lymph that companies in the MySQL ecosystem need to survive and innovate.

Right now, in the MySQL ecosystem, Amazon is the real winner. Unfortunately, there are no public figures that can prove my point, not from the MySQL vendors, nor from Amazon. DBaaS is at the moment the only way to make users pay for a standard, fully compatible MySQL. In topping up X% of standard EC2 instances, Amazon provides a risk free, ready to use and near-zero administration database – and this is a big thing for DevOps, startup, application-focused teams who need to keep their resource costs down and will never hire super experts in MySQL.

With Aurora, Amazon has significantly raised the bar in the DBaaS competition, where there are no real competitors to Amazon at the moment. Dimitry Kravtchuck wrote in his blog that “MySQL 5.7 is already doing better” than Aurora. I have no doubts that a pure 5.7 on bare metal delivers great performance, but I think there are some aspects to consider. First of all, Aurora target customers do not want to deal with my.cnf parameters – even if we found out that Aurora is in reality nothing more than a smart configurator for MySQL, which can magically adapt mysqld instances on a given workload, it would still be good enough for the end customers. Second and most important point, Aurora is [supposed to be] the combination of standard MySQL (i.e. not an esoteric and innodb-incompatible storage engine) that delivers good performance in AWS – if Amazon found out that they can provide the same cloud features using a new and stable version of MySQL 5.7, I have no doubts they would replace Aurora with a better version of MySQL, probably keeping the same name and ignoring the version number, and even more importantly enjoying the revenues that the new improved version would generate.

The ball is in our court

With Aurora, Amazon is well ahead of any other vendor – public cloud and MySQL-based technology vendors – in providing MySQL as DBaaS. Rest assured that Google, Azure, Rackspace, HPCloud, Joyent and others are not sitting there watching, but in the meantime they are behind Aurora. Some interesting projects that can fill the gap are going on at the moment. Tesora is probably the most active company in this area, focusing on OpenStack as the platform for public and private clouds. Continuent provides a technology that could be adapted for this use too and the recent acquisition from VMware may give a push to some projects in this area. Considering the absence of the traditional MySQL players in this area, on one side this is an opportunity for new players to invest in products that are very likely to generate revenues in the near future. On the other hand, it is a concern that the lack of options for DBaaS will convince more end customers to adopt NoSQL technologies, which are more suitable to work in distributed, cloud infrastructures.