It does not matter if Aurora performs 1x or 10x MySQL: it _is_ a big thing

I spent the last 4 years at SkySQL/MariaDB working on versions of MySQL that could be “suitable for the cloud”. I strongly believed that the world needed a version of MySQL that could work in the cloud even better than its comparable version on bare metal. Users and administrators wanted to benefit from the use of cloud infrastructures and at the same time they wanted to achieve the same performance and overall stability of their installations on bare metal. Unfortunately, ACID-compliant databases in the cloud suffer from the issues that any centrally controlled and strictly persistent system can get when hosted on highly distributed and natively stateless infrastructures.

In this post I am not going to talk about the improvements needed for MySQL in the cloud – I will tackle this topic in a future post. Today I’d like to focus on the business side of RDS and Aurora.

In the last 4 years I had endless discussions over the use of Galera running in AWS on standard EC2 instances. I tried to explain many times that having Galera in such environment was almost pointless, since administrators did not have real control of the infrastructure. The reasons have nothing to do with the quality and the features of Galera, rather with the use of a technology placed in the wrong layer of the *aaS stack. Last but not least, I tried many times to guide the IT managers through the jungle of hidden costs of an installation of Galera (and other clustering technologies) in EC2, working through VPCs, guaranteed IOPs, dedicated instances and AZs etc.

I had interesting meetings with customers and prospects to help them in the analysis of the ROI of a migration and the TCO of an IT service in a public cloud. One example in particular, a media company in North America, was extremely interesting. The head of IT decided that a web service had to be migrated to AWS. The service had predicable web access peaks, mainly during public events – a perfect fit for AWS. When an event is approaching, Sysadms can launch more instances, then they can close them when the event ends. Unfortunately, the same approach cannot be applied to database servers, as their systems require to keep data available at all times. Each new event requires more block storage with higher IOPs and the size and flavour of the DB instances becomes so high spec that the overall cost of running everything in EC2 is higher than the original installation.

Aurora from an end customer perspective

Why is Aurora a big thing? Here are some points to consider:

1. No hidden costs in public clouds

The examples of Galera and the DB servers in AWS that I mentioned, are only two of the surprises that IT managers will find in their bills. There is a very good reason why public clouds have (or should have) a DBaaS offering: databases should be part of IaaS. They must make the most out of the knowledge of the bare metal layer, in terms of physical location, computing and storage performance, redundancy and reliability etc. Cloud users must use the database confidently, leaving typical administration tasks such as data backups and replication to automated systems that are part of the infrastructure. Furthermore, end customers want to work with databases that do not suffer resource contention in terms of processing, storage and network – or at least not in a way that is perceivable from an application standpoint. As we select EBS disks with requested IOPs, we must be able to use a database server with requested QPSs – whatever we define as “Query”. The same should happen for private clouds, since technologies, benefits and disadvantages are substantially the same. In AWS, RDS has already these features, but Aurora simply promises a better experience with more performance and reliability. Sadly, not many alternatives are available from other cloud providers.

2. Reduce the churn rate

A consequence of the real or expected hidden costs is a relatively high churn rate that affects many IT projects in AWS. DevOps start with AWS because it is simple and available immediately, but as the the business grows, so does the bill, and sometimes the growth is not proportional. Amazon needed to remove the increase in costs for their database as one of the reasons to leave or reduce the use of a public cloud, and Aurora is a significant step in this direction. I expect end customers to be more keen to keep their applications on AWS in the long run.

A strong message to the MySQL Ecosystem

There are lots of presentations and analysis around MySQL and the MySQL flavours, yet none of these analysis looks at the generated revenues from the right perspective. Between 2005 and 2010, MySQL was a hot technology that many considered as a serious alternative to closed source relational databases. In 2014, with an amazing combination of factors:

  • A vast number of options available as open source technologies in the database market
  • A substantial change in the IT infrastructure, focused on virtualisation and cloud operations
  • A substantial change in the development of applications and in the types of applications, now dominated by a DevOps approach
  • A fracture in the MySQL ecosystem, caused by forks and branches that generated competition but also confusion in the market
  • An increasing demand for databases focused on rich media and big data
  • A relatively stable and consolidated MySQL
  • A good level of knowledge and skills available in the market
    (…and the list goes on…)

All these factors have not only limited the growth in revenues in the MySQL ecosystem, but have basically shrunk them – if you do not consider the revenues coming from DBaaS. Here is a pure speculation: Oracle gets a good chunk of their revenues for MySQL from OEMs (i.e. commercial licenses) and from existing not-only-MySQL Oracle customers. Although Percona works hard in producing a more differentiated software product (and kudos for the work that the Percona software team does in terms of tooling and integration), the company adopted a healthy, but clearly services-focused business model. The MariaDB approach is more similar to Oracle, but without commercial licenses and without a multi-billion$ customers base. Yet, when you review the now 18 months’ old keynote from 451 research at Percona Live, you realise that the focus on “Who uses MySQL?” is pretty irrelevant: MySQL is ubiquitous and will be the most used open source database in the upcoming years. The question we should ask is rather, “Who pays for MySQL?”, or even better, “Why should one pay for MySQL?”: a reasonable fee paid for MySQL is the lymph that companies in the MySQL ecosystem need to survive and innovate.

Right now, in the MySQL ecosystem, Amazon is the real winner. Unfortunately, there are no public figures that can prove my point, not from the MySQL vendors, nor from Amazon. DBaaS is at the moment the only way to make users pay for a standard, fully compatible MySQL. In topping up X% of standard EC2 instances, Amazon provides a risk free, ready to use and near-zero administration database – and this is a big thing for DevOps, startup, application-focused teams who need to keep their resource costs down and will never hire super experts in MySQL.

With Aurora, Amazon has significantly raised the bar in the DBaaS competition, where there are no real competitors to Amazon at the moment. Dimitry Kravtchuck wrote in his blog that “MySQL 5.7 is already doing better” than Aurora. I have no doubts that a pure 5.7 on bare metal delivers great performance, but I think there are some aspects to consider. First of all, Aurora target customers do not want to deal with my.cnf parameters – even if we found out that Aurora is in reality nothing more than a smart configurator for MySQL, which can magically adapt mysqld instances on a given workload, it would still be good enough for the end customers. Second and most important point, Aurora is [supposed to be] the combination of standard MySQL (i.e. not an esoteric and innodb-incompatible storage engine) that delivers good performance in AWS – if Amazon found out that they can provide the same cloud features using a new and stable version of MySQL 5.7, I have no doubts they would replace Aurora with a better version of MySQL, probably keeping the same name and ignoring the version number, and even more importantly enjoying the revenues that the new improved version would generate.

The ball is in our court

With Aurora, Amazon is well ahead of any other vendor – public cloud and MySQL-based technology vendors – in providing MySQL as DBaaS. Rest assured that Google, Azure, Rackspace, HPCloud, Joyent and others are not sitting there watching, but in the meantime they are behind Aurora. Some interesting projects that can fill the gap are going on at the moment. Tesora is probably the most active company in this area, focusing on OpenStack as the platform for public and private clouds. Continuent provides a technology that could be adapted for this use too and the recent acquisition from VMware may give a push to some projects in this area. Considering the absence of the traditional MySQL players in this area, on one side this is an opportunity for new players to invest in products that are very likely to generate revenues in the near future. On the other hand, it is a concern that the lack of options for DBaaS will convince more end customers to adopt NoSQL technologies, which are more suitable to work in distributed, cloud infrastructures.