Thursday, April 03, 2014

Public Cloud Instance Pricing Wars - Detailed Context and Analysis

As part of my opening keynote at Cloud Connect in Las Vegas I summarized the latest moves in cloud, the slides are available via the new Powered by Battery site as "The Good the Bad and the Ugly: Critical Decisions for the Cloud Enabled Enterprise". This blog post is a detailed analysis of just part of what happened.

Summary points
  • AWS users should migrate from obsolete m1, m2, c1, c2 to the new m3, r3, c3 instances to get better performance at lower prices with the latest Intel CPUs.
  • Any cloud benchmark or cost comparison that uses the AWS m1 family as a basis should be called out as bogus benchmarketing.
  • AWS and Google instance prices are essentially the same for similar specs.
  • Microsoft doesn’t appear to have the latest Intel CPUs generally available and only matches prices for obsolete AWS instances.
  • IBM Softlayer pricing is still higher, especially on small instance types
  • Google's statement that prices should follow Moore’s law implies that we should expect prices to halve every 18-24 months
  • Pricing pages by AWS, Google Compute Engine, Microsoft Azure, IBM Softlayer
  • Adrian’s spreadsheet summary of instances from the above vendors at http://bit.ly/cloudinstances
  • Analysis of the prices by Rightscale

On Tuesday 25th March 2014 Google announced some new features and steep price cuts, the next day Amazon Web Services also announced new features and matching price cuts. On Monday 31st March Microsoft Azure also reduced prices. Many pundits repeated talking points from press releases in blog posts but unfortunately there was little attempt to understand what really happened, and explain the context and outcome. When I wrote up a summary for my opening keynote at Cloud Connect on 31st March I looked at the actual end result and came up with a different perspective and a list of gaps.

I’m only going to discuss instance types and on-demand prices here. There was a lot more in the announcements that other people have done a good job of summarizing. The Rightscale blog linked above also gives an accurate and broader view on what was announced. I will discuss other pricing models beyond on-demand in future blog posts.

There are some things you need to know to get the right background context for the instance price cuts. The most important is to understand that AWS has two generations of instance types, and is in a transition from Intel CPU technology they introduced five or more years ago to a new generation introduced in the last year. The new generation CPUs are based on an architecture known as Sandybridge. The latest tweak is called Ivybridge and has incremental improvements that give more cores per chip and slightly higher performance. Since Google is a recent entrant to the public cloud market, all their instances types are based on Sandybridge. To correctly compare AWS prices and features with Google, there is a like-for-like comparison that can be made. AWS is encouraging the transition by pricing its newer faster instances at a lower cost than the older slower ones. In the recent announcement, AWS cut the prices by obsolete instance type families by a smaller percentage than the newer instance type families, so the gap has just widened.

Old AWS instance types have names starting with m1, m2 and c1, c2. They all have newer replacements known as m3, r3 and c3 except the smallest one – the m1.small. The newer instances have a similar amount of RAM and CPU threads, but the CPU performance is significantly higher. The new equivalents also replace small slow local disks with smaller but far faster and more reliable solid-state disks, and the underlying networks move from 1Gbit/s to 10Gbit/s. The newer instance families should also have lower failure rates.

Most people are much more familiar with the old generation instance types, and competitors write their press releases they are able to get away with claiming that they are both faster and cheaper than AWS, by comparing against the old generation products. This is an old “benchmarketing” trick – compare your new product against the competitions older and more recognizable product.

For the most commonly used instance types there is a close specification match between the AWS m3 and the Google n1-standard. They are also exactly the same price per hour. Since AWS released its changes after Google, this implies that AWS deliberately matched Google’s price. The big architectural difference between the vendors is that Google instances are diskless, all their storage is network attached, while AWS have various amounts of SSD included. The AWS hypervisor also makes slightly more memory available per instance, and ratings for the c3 imply that AWS is supplying a slightly higher CPU clock rate for that instance type. I think that this is because AWS has based its compute intensive c3 instance types on a higher clock rate Ivybridge CPU rather than the earlier Sandybridge specification. For the high memory capacity instance types it is a little different. The Google n1-himem instances have less memory available than the AWS r3 equivalents, and cost a bit less. This makes intuitive sense as this instance type is normally bought for its memory capacity.

Microsoft previously committed to match AWS prices, and in their announcement their comparisons matched the m1 range exactly at it’s new price, and they compared their memory oriented A5 instance as cheaper than an old m2.xlarge, but the A5 is an older slower CPU type, more expensive ($0.22 vs $0.18) and has less memory (14GB vs. 15GB) than the AWS r3.large. The common CPU options on Azure are aligned with the older AWS instance types. Azure does have Intel Sandybridge CPUs for compute use cases as the A8 and A9 models, but I couldn't find list pricing for them and they appear to be a low volume special option. The Azure pricing strategy ignores the current generation AWS product, so the price match guarantee doesn’t deliver. In addition the Google and AWS price changes were effective from April 1st, but Azure takes effect May 1st.

IBM Softlayer has a choose-what-you-want model rather than a specific set of instance types. The smaller instances are $0.10/hr where AWS and Google n1-standard-1 are $0.07/hr. As you pick a bigger instance type on Softlayer the cost doesn’t scale up linearly, while Google and AWS double the price each time the configuration doubles. The Softlayer equivalent of the n1-standard-16 is actually slightly lower cost than Google. Softlayer pricing on most instances is in the same ballpark as AWS and Azure were before the cuts, so I expect they will eventually have to cut prices to match the new level.

Gaps and Missing Features

The remaining anomaly in AWS pricing is the low-end m1.small. There is no newer technology equivalent at present, so I wouldn’t be surprised to see AWS do something interesting in this space soon. Generally AWS has a much wider range of instances than Google, but AWS is missing an m3.4xlarge to match Google's n1-standard-16, and the Google hicpu range has double the CPU to RAM ratio of the AWS c3 range so they aren’t directly comparable.

Google has no equivalent to the highest memory and CPU AWS instances, and has no local disk or SSD options. Instead they have better attached disk performance than AWS Elastic Block Store, but attached disk adds to the instance cost, and can never be as fast as local SSD inside the instance.

Microsoft Azure needs to refresh its instance type options, it has a much smaller range, older slower CPUs, and no SSD options. It doesn’t look particularly competitive.

Conclusion

If you buy hardware and capitalize it over three years, and later on there is a price cut; you don’t get to reduce your monthly costs. Towards the end your CPUs are getting old, leading to less competitive response times and higher failure rates. With public cloud vendors driving the costs down several times a year and upgrading their instances, your model of public vs. private costs needs to factor in something like Moore’s law for cost reductions and a technology refresh more often than every three years. Google actually said we should expect Moore’s law to apply in their announcement, which I interpret to mean that we can expect costs to halve about every 18-24 months. This isn’t a race to zero; it’s a proportional reduction every year. Over a three-year period the cost at the end is a third to a quarter of the cost at the start.

I still hear CIOs worry that cloud vendor lock-in would let them raise prices. This ruse is used to justify private cloud investments. Even without switching vendors, you will see repeated price reductions for the public cloud systems you are already using. This was the 42nd price cut for AWS, the argument is ridiculous.


I’ve previously published presentation materials on costoptimization with AWS. I’m researching this area and over the coming months will publish a series of posts on all aspects of cloud optimization.