Friday, January 15, 2010

RE: BitSource - Rackspace Cloud Servers versus Amazon EC2: Performance Analysis

The BitSource was recently hired by Encoding.com to conduct a performance comparison between EC2 and Rackspace Cloud. The full details of their analysis are available here. Here is a summary of their findings:

On CPU Performance
"On average, Cloud Servers was more than twice as fast as Amazon EC2 at compiling the Linux kernel across all instance sizes."

On Disk I/O
"Disk I/O results show that Cloud Servers consistently have much better write and random write performance than EC2 across most sizes."

They used a combination of IOZone and Linux kernel compiling to conduct their analysis. Here is our opinion on this (also posted as a comment at the bottom of their article):


Our Comments
We've also compared EC2 and Rackspace performance using Geekbench, SPECjvm, hdparm, mysqlbench, and UnixBench. Here is one example result set:

Rackspace 4GB Instance:
Geekbench: 2841
hdparm buffered disk reads: 165 MB/sec
SPECjvm Composite result: 51.99
mysqlbench: 1330 wallclock seconds
unixbench: 777

EC2 m1.large instance (w/ EBS local storage):
Geekbench: 3113
hdparm buffered disk reads: 65 MB/sec
SPECjvm Composite result: 36.51
mysqlbench: 1327 wallclock seconds
unixbench: 663

The disk I/O results are a bit better with Rackspace based on low-level measurements, but at a higher application level, like mysqlbench, they are very similar. CPU/memory performance is also a mixed bag, with the EC2 instance performing better in Geekbench and Rackspace performing better in UnixBench and SPECjvm.
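
As a side note, low-level numbers like the hdparm buffered read results above are easy to script and repeat. Here's a minimal sketch of how one might automate that measurement; it assumes hdparm is installed, root privileges, and uses /dev/sda as a placeholder device (this is an illustration, not the exact harness we ran):

import re
import subprocess

def buffered_read_rate(device="/dev/sda", runs=3):
    """Run `hdparm -t` several times and return the average MB/sec."""
    rates = []
    for _ in range(runs):
        out = subprocess.check_output(["hdparm", "-t", device], text=True)
        # hdparm -t prints a line like:
        #   Timing buffered disk reads: 300 MB in 3.01 seconds = 99.67 MB/sec
        match = re.search(r"=\s*([\d.]+)\s*MB/sec", out)
        if match:
            rates.append(float(match.group(1)))
    return sum(rates) / len(rates) if rates else None

if __name__ == "__main__":
    rate = buffered_read_rate()
    if rate is not None:
        print("Average buffered disk read rate: %.2f MB/sec" % rate)

Averaging a few runs helps smooth out the noisy-neighbor effects that are common on shared virtualization hosts.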

However, I think it is a bit of a stretch to put Rackspace Cloud on the same playing field as EC2. EC2 is a much more mature platform with many more features, like multiple data centers, instance-independent storage (EBS), auto scaling and monitoring, load balancing, VPC, and more. Rackspace Cloud is basically just a VPS with on-demand pricing. Of course, they provide free support and an excellent CDN offering based on Limelight (much better than CloudFront), but their IaaS offering leaves a lot to be desired.

Also, on pricing, EC2 is much better on the high end, particularly with reserved instances (e.g. a 7.5GB m1.large reserved instance works out to about $0.17/hr over a 3-year term, while an 8GB Rackspace instance is nearly 3x as expensive at $0.48/hr). We've also done some public bandwidth testing, and all 3 EC2 regions provided generally faster downlink throughput than the Rackspace Cloud Dallas data center.
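
For those curious how a reserved instance nets out to a figure like $0.17/hr, the math is just the one-time reservation fee amortized across the term, plus the discounted hourly usage rate. Here's a quick sketch; the $1400 upfront fee and $0.12/hr usage rate are our recollection of the published m1.large 3-year pricing and should be treated as assumptions (check the pricing page for current numbers):

HOURS_PER_YEAR = 8760

def effective_hourly_rate(upfront, hourly, years):
    """Amortize the one-time reservation fee over the full term."""
    total_hours = years * HOURS_PER_YEAR
    return (upfront + hourly * total_hours) / total_hours

# e.g. $1400 upfront + $0.12/hr usage, running 24x7 for 3 years
print("Effective rate: $%.3f/hr" % effective_hourly_rate(1400, 0.12, 3))
# -> roughly $0.17/hr, versus $0.48/hr for the comparable Rackspace instance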

I think Rackspace is off to a good start with the cloud platform built on their Slicehost acquisition. My hat really goes off to their marketing team, too.

Here are links to the Geekbench tests we ran:

EC2 m1.large - score: 3113

4GB Rackspace Cloud Server - score: 2841

Sunday, January 10, 2010

Cloud Speed Test


We are working on a cloud speed test. It uses Flash to download a 1MB test file from, and upload a 1/2MB file to, various cloud services. It then measures how long each operation takes and reports a transfer rate for each service.
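
Conceptually the measurement is simple: time the transfer and divide the payload size by the elapsed time. Here is roughly the calculation the Flash client performs, sketched in plain Python (the URL is a placeholder, not one of our actual test endpoints):

import time
import urllib.request

def download_rate_mbps(url):
    """Download `url` once and return the observed rate in megabits/sec."""
    start = time.time()
    data = urllib.request.urlopen(url).read()
    elapsed = time.time() - start
    return len(data) * 8 / elapsed / 1_000_000

rate = download_rate_mbps("http://example-cloud-service.com/testfile-1mb.bin")
print("Downlink: %.2f Mb/s" % rate)

The upload measurement works the same way in reverse: time how long the 1/2MB payload takes to send.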

What we are hoping to accomplish with this is twofold: to let users measure their bandwidth performance against various cloud services, and to aggregate that data in order to provide a more accurate analysis of the bandwidth performance those cloud services provide. In the previous blog post we summarized bandwidth performance using Pingdom over a period of a couple of months as the means of measurement. The speed test will provide us with a much larger and more diverse test population. The end result we are looking for is to let users view overall, time-based, and geographically targeted bandwidth performance measurements for public clouds and services.

The speed test is still very much in beta form. It does not currently allow you to filter which cloud services you'd like to test. However, it is mostly functional, and we'd appreciate anyone trying it out and providing feedback. If you allow the test to run all the way through, it will download about 40MB from and upload about 10MB to the cloud services, and provide you with transfer rates for each based on your Internet connection.


Saturday, January 9, 2010

Bandwidth Disparity in the Cloud

Over the past 2 months we've been using Pingdom to monitor downlink throughput for various cloud services. We placed a 5MB test file in each of these cloud services and then configured Pingdom to pull that file every 15 minutes. Pingdom maintains 15 globally placed servers, including 1 US West (Los Angeles), 2 US East (NY, VA), 1 US Central (Chicago), 5 US South and SW (Dallas, Houston, Atlanta), 1 Canada (Montreal), and 4 Europe (London, Stockholm, Frankfurt, Amsterdam). Every 15 minutes it pulls the test file from each of the public clouds using round-robin monitor selection. Although I agree this is not the best method for measuring downlink throughput, it was affordable and easy to set up, and I believe it can at least provide some comparative value at a high level.
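
Turning the raw samples into Mb/s figures is straightforward: each timed fetch of the 5MB file converts to a rate, and the rates are averaged per service. A small illustrative sketch (the sample values below are made up, not our actual data):

from collections import defaultdict

FILE_BITS = 5 * 1024 * 1024 * 8  # the 5MB test file, in bits

# (service, seconds to fetch the file) pairs as a monitor would record them
samples = [
    ("EC2 US East", 1.7), ("EC2 US East", 1.9),
    ("Terremark vCloud", 9.1), ("Terremark vCloud", 9.8),
]

rates = defaultdict(list)
for service, seconds in samples:
    rates[service].append(FILE_BITS / seconds / 1_000_000)  # Mb/s

for service, values in sorted(rates.items()):
    print("%s: %.2f Mb/s average over %d samples"
          % (service, sum(values) / len(values), len(values)))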

The results of these tests have shown a good amount of bandwidth disparity between the different cloud providers.

Cloud Servers (IaaS):
The greatest disparity in these tests was between cloud server providers. We set up the smallest instance possible with each of the public clouds (e.g. EC2's m1.small, Linode 360, etc.), running Linux and Apache with a 5MB static test file. The results show average downlink speeds ranging from 4.45 Mb/s with Terremark's vCloud service to 23.42 Mb/s with Amazon EC2 US East (which is, coincidentally, higher than 3 of the CDNs we also tested).

[Table: Service / Rate (Mb/s)]


Content Delivery Networks (CDN):
If you happen to not know what a CDN is, it is basically a service that lets you deliver Internet content to your users faster than you could by hosting it on your own servers. CDNs enable this by maintaining many "edge" servers located throughout the world, each with a copy of your files. They then use some DNS magic such that when a user in the US requests a file, it is pulled from a US server, while a user in Europe gets it from a Europe-based server. The end result is that the user gets the file from the closest and/or fastest server available to them. This also offloads a lot of static content bandwidth from your servers.
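
You can see this DNS magic for yourself by asking resolvers in different parts of the world for the same CDN-fronted hostname and comparing the answers. A rough sketch using the dnspython package (resolve() requires dnspython 2.0+; use resolver.query() on older versions). The hostname and the second resolver IP are placeholders, and public anycast resolvers may themselves be geo-located, so treat this purely as an illustration:

import dns.resolver

HOSTNAME = "static.example-cdn.net"  # hypothetical CDN-fronted hostname

RESOLVERS = {
    "US": "8.8.8.8",            # Google Public DNS
    "Europe": "198.51.100.1",   # placeholder: substitute a Europe-based resolver
}

for region, nameserver in RESOLVERS.items():
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [nameserver]
    answer = resolver.resolve(HOSTNAME, "A")
    print(region, "->", sorted(rdata.address for rdata in answer))

If the hostname is behind a geo-DNS CDN, the two queries will typically return different edge server addresses.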

To test these CDNs, we set up accounts with each and uploaded a 5MB test file to each for testing. The only exceptions are CacheFly, which provides a 5MB test file for free, and Akamai, for which we were able to find an existing 5MB test file (thanks DNC).

The big surprise for us here was that Akamai, the "big dog" among CDNs, was not the top performer. Other than that, the remaining 3 of the top 4 were as we had expected: Akamai, Limelight, and EdgeCast.

[Table: Service / Rate (Mb/s)]


Cloud Platforms and Services (PaaS/SaaS):
Platforms and SaaS offer a higher level of cloud computing capabilities. Force.com, for example, allows you to create databases and forms very quickly with its web-based management platform, and Google Sites lets you create a website within a few minutes. If your business software requirements can fit the "mold" of a PaaS or SaaS, they are oftentimes a great choice for a very quick time-to-market.

We set up accounts with each of these providers and posted a 5MB test file in each for testing. Force.com basically maintains its own in-house CDN for static content delivery, which explains its significantly better performance.

[Table: Service / Rate (Mb/s)]

Next Steps
We recognize that the biggest problem with this analysis is the limited scope of testing with only Pingdom. We are developing a "Cloud Speedtest" that we will pay hundreds of users with normal residential and commercial Internet connections to run. This speed test will record both uplink and downlink throughput to various cloud services. We will aggregate that data with the Pingdom results, our own testing, and some other sources to produce a much better analysis of many cloud services.

Cloud Monitoring

Over the past few months we've been monitoring uptime and bandwidth for various cloud services, including CDNs (e.g. Akamai, EdgeCast), Platforms (e.g. App Engine, Azure), Servers (e.g. EC2, GoGrid), and Storage (e.g. S3, Nirvanix). To do so, we created accounts with these cloud providers and set up files and servers to be monitored on each. We then used both Pingdom and Panopta to monitor them. Pingdom maintains 15 globally placed servers, which we use to pull a 5MB test file every 15 minutes in order to measure downlink throughput. We use both Panopta and Pingdom to monitor uptime. Panopta does a better job here because it verifies outages with 3 different monitors before triggering an alert, which has resulted in fewer false positives than we experience with Pingdom (there's a small sketch of this multi-check idea at the end of this post). We also manually verify most downtime periods with the service provider and exclude maintenance periods. We have the results of these ongoing tests posted here:

Uptime % for each cloud service:

Pingdom monitoring public report (based on a 5MB test file):
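
As a footnote on the outage-verification approach mentioned above: the idea is simply to require multiple independent checks to fail before declaring an outage. A toy sketch of that logic in Python (real services like Panopta use geographically separate monitors, while this illustration just repeats the check from one host; the URL is a placeholder):

import urllib.request

def is_up(url, timeout=10):
    """One probe: does the endpoint answer an HTTP request in time?"""
    try:
        return urllib.request.urlopen(url, timeout=timeout).getcode() < 400
    except Exception:
        return False

def confirmed_outage(url, probes=3):
    """Only declare an outage when every probe agrees the endpoint is down."""
    return all(not is_up(url) for _ in range(probes))

if confirmed_outage("http://example-cloud-service.com/testfile.bin"):
    print("outage confirmed by 3 consecutive failed checks")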