Sunday, October 3, 2010

Cloudscaling & KT: Private cloud validation using benchmarking

A few months ago we were contacted by Cloudscaling CEO Randy Bias regarding our work in benchmarking public IaaS clouds (see previous blog posts). His team was working on a large private cloud deployment for KT, Korea's largest landline and second largest mobile carrier, and was interested in using similar techniques to validate that private cloud. This validation would include not only raw benchmark results, but also comparisons of how the private cloud stacked up against existing public clouds such as EC2 and GoGrid. This data would be useful not only for Cloudscaling to validate their own work, but also as a reference for their client KT. We agreed to the project, and benchmarking was conducted over a 4-day period last August. Our deliverables included raw benchmark data and an executive report highlighting the results. In this post we provide the results of these benchmarks.

Multi-Tenancy and Load Simulation
Benchmarking of public IaaS clouds involves a certain amount of ambiguity due to the scheduling and allocation of resources in multi-tenant virtualized environments. One of the fundamental jobs of a hypervisor such as VMware or Xen is to allocate shared resources in a fair and consistent manner. In order to maximize performance and utilization, hypervisors are designed to allocate resources such as CPU and disk IO using a combination of fixed and burstable methods. For example, when a VM requests CPU resources, the hypervisor will generally provide more resources when neighboring VMs are idle than when they are also requesting CPU resources. In very busy environments, this often results in variable and inconsistent VM performance.
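To make the fixed-plus-burstable idea concrete, here is a toy model in Python. It is only a sketch under our own assumptions: the function name, the equal split of spare capacity, and the example numbers are hypothetical, and this is not how VMware or Xen actually schedule CPU. Each VM first receives up to its guaranteed share, and whatever capacity is left over is divided among the VMs that still want more.

```python
def allocate_cpu(capacity, guaranteed, demand):
    """Toy fixed + burstable allocation (illustrative only).

    capacity   -- total CPU units on the host
    guaranteed -- dict of VM name -> fixed share (assumed to sum to <= capacity)
    demand     -- dict of VM name -> CPU units the VM is currently requesting
    """
    # Fixed part: every VM gets what it asks for, up to its guarantee.
    alloc = {vm: min(demand[vm], guaranteed[vm]) for vm in demand}
    spare = capacity - sum(alloc.values())
    hungry = {vm for vm in demand if demand[vm] > alloc[vm]}

    # Burst part: split spare capacity evenly among VMs that still want more.
    while spare > 1e-9 and hungry:
        share = spare / len(hungry)
        for vm in list(hungry):
            extra = min(share, demand[vm] - alloc[vm])
            alloc[vm] += extra
            spare -= extra
            if demand[vm] - alloc[vm] <= 1e-9:
                hungry.discard(vm)
    return alloc


guaranteed = {"a": 2, "b": 2, "c": 2, "d": 2}
idle_neighbors = allocate_cpu(8, guaranteed, {"a": 8, "b": 0, "c": 0, "d": 0})
busy_neighbors = allocate_cpu(8, guaranteed, {"a": 8, "b": 8, "c": 8, "d": 8})
print(idle_neighbors["a"], busy_neighbors["a"])
```

In this model VM "a" receives the whole host when its neighbors are idle and only its guaranteed share when they are busy, which is the kind of variability that makes public cloud benchmarks hard to interpret.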

Because the KT cloud benchmarking was conducted pre-launch, there was no other load in the environment besides our benchmarking. To offset this, we ran the benchmarks twice. In the first run, the benchmarks were run individually to provide maximum performance. In the second run, we attempted to simulate a loaded environment by filling the cloud to about 70% capacity with VMs instructed to perform a random sample of load-simulating benchmarks (using mostly non-synthetic benchmarks like tpcc, blogbench and pgbench). The benchmarks for the second run were conducted concurrently with the load simulation. The tables and graphs below provide the unloaded benchmark results. Differences between those and the loaded results are noted above each set of results.
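The load simulation described above can be sketched roughly as follows. This is an illustration rather than the tooling we used: the slot count, the VM naming scheme, and the `plan_load` helper are hypothetical, and only the 70% fill target and the three benchmark names come from the description above.

```python
import random

# Benchmarks named in the post as examples of the load-simulating set.
LOAD_BENCHMARKS = ["tpcc", "blogbench", "pgbench"]


def plan_load(total_slots, fill_percent=70, seed=None):
    """Assign a randomly chosen benchmark to each load-generating VM.

    total_slots  -- hypothetical number of VMs the cloud can host
    fill_percent -- share of those slots to occupy with load VMs
    seed         -- optional seed so a plan can be reproduced
    """
    rng = random.Random(seed)
    n_load_vms = total_slots * fill_percent // 100
    return [
        (f"load-vm-{i:03d}", rng.choice(LOAD_BENCHMARKS))
        for i in range(n_load_vms)
    ]


plan = plan_load(total_slots=100, seed=42)
print(len(plan), "load VMs planned")
```

Seeding the random generator keeps the plan repeatable, which helps when a loaded run needs to be repeated under comparable conditions.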

Organization of Results
The results below are separated into 2 general VM types: a large (16 & 32GB) VM and a small (2GB) VM. Comparative data is also shown from public clouds including BlueLock, GoGrid, Amazon EC2, Terremark vCloud Express and Rackspace Cloud. We conducted similar benchmarking in these public clouds earlier this year. The results provided are based on 5 aggregate performance metrics we created and discussed in previous blog posts: CPU performance (CCU), disk IO performance (IOP), programming language performance, memory IO performance (MIOP), and encoding & encryption performance.

Note on CPU Stats
The servers tested in all of these benchmarks run within virtualized environments. The cores shown in the benchmark tables below are the number of cores or vCPUs exposed to the virtual server by the hypervisor. This is often not the same as the number of physical cores available on the host system.
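As a small illustration of what "cores exposed to the virtual server" means in practice, the following Python snippet reports the CPU count as seen from inside a server. It is a sketch, not part of our benchmark suite; the `visible_vcpus` helper and its fallback value of 1 are our own assumptions. On a VM, the number reported is the vCPU count assigned by the hypervisor, not the physical core count of the host.

```python
import os


def visible_vcpus():
    """Return the number of logical CPUs the operating system reports."""
    count = os.cpu_count()  # can be None if the count cannot be determined
    # Falling back to 1 is an assumption made for this sketch.
    return count if count is not None else 1


print(f"CPUs visible to this server: {visible_vcpus()}")
```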

Benchmark Results

CPU Performance
CPU benchmark results during the loaded and unloaded benchmark runs were roughly equivalent.

Large Server

| Cloud | Server | CPU | Memory | CCUs |
|---|---|---|---|---|
| BlueLock | 16gb-8cpu | Xeon X5550 2.67 GHz [8 cores] | 16 GB | 29.2 |
| KT | 32GB/6x2GHz | Xeon L5640 2.27 GHz [6 cores] | 32 GB | 28.66 |
| GoGrid | 8gb | Xeon E5450 2.99 GHz [6 cores] | 8 GB | 27.36 |
| Amazon EC2 | m2.2xlarge | Xeon X5550 2.67 GHz [4 cores] | 34.2 GB | 25.81 |
| KT | 16GB/3x2GHz | Xeon L5640 2.27 GHz [3 cores] | 16 GB | 15.27 |
| Terremark | 16gb-8vpu | AMD 8389 2.91 GHz [8 cores] | 16 GB | 9.81 |
| Amazon EC2 | m2.xlarge | Xeon X5550 2.67 GHz [2 cores] | 17.1 GB | 9.1 |
| Rackspace Cloud | 16gb | AMD 2374 2.20 GHz [4 cores] | 16 GB | 5.1 |

CPU Performance
Small Server

| Cloud | Server | CPU | Memory | CCUs |
|---|---|---|---|---|
| BlueLock | 2gb | Xeon X5550 2.67 GHz [1 core] | 2 GB | 6.37 |
| KT | 2GB/1x2GHz | Xeon L5640 2.27 GHz [1 core] | 2 GB | 5.98 |
| Terremark | 2gb | AMD 8389 2.91 GHz [1 core] | 2 GB | 5.57 |
| Rackspace Cloud | 2gb | AMD 2374 2.20 GHz [4 cores] | 2 GB | 5.08 |
| GoGrid | 2gb | Xeon E5520 2.27 GHz [2 cores] | 2 GB | 5.02 |
| Amazon EC2 | c1.medium | Xeon E5410 2.33 GHz [2 cores] | 1.7 GB | 3.49 |


Disk IO Performance
Disk IO performance was 20-30% slower than shown below during the loaded benchmark run. The KT cloud uses external SAN storage for VM instance storage. This, combined with the fact that the load simulation benchmarks were fairly disk IO intensive (probably more so than an actual production environment), is likely the reason for the slowdown. Even so, disk IO performance was very good. It should be noted that GoGrid and Rackspace Cloud do not utilize external VM instance storage.

Large Server

| Cloud | Server | CPU | Memory | IOP |
|---|---|---|---|---|
| KT | 16GB/3x2GHz | Xeon L5640 2.27 GHz [3 cores] | 16 GB | 127.05 |
| KT | 32GB/6x2GHz | Xeon L5640 2.27 GHz [6 cores] | 32 GB | 125.31 |
| GoGrid | 8gb | Xeon E5450 2.99 GHz [6 cores] | 8 GB | 122.62 |
| Terremark | 16gb-8vpu | AMD 8389 2.91 GHz [8 cores] | 16 GB | 112.59 |
| Rackspace | 16gb | AMD 2374 HE 2.20 GHz [4 cores] | 16 GB | 100.15 |
| Amazon EC2 | m2.2xlarge | Xeon X5550 2.67 GHz [4 cores] | 34.2 GB | 96.22 |
| Amazon EC2 | m2.xlarge | Xeon X5550 2.67 GHz [2 cores] | 17.1 GB | 87.87 |

Disk IO Performance
Small Server

| Cloud | Server | CPU | Memory | IOP |
|---|---|---|---|---|
| GoGrid | 2gb | Xeon E5520 2.27 GHz [2 cores] | 2 GB | 143.35 |
| KT | 2gb | Xeon L5640 2.27 GHz [1 core] | 2 GB | 133.08 |
| Terremark | 2gb | AMD 8389 2.91 GHz [1 core] | 2 GB | 96.9 |
| Rackspace | 2gb | AMD 2374 HE 2.20 GHz [4 cores] | 2 GB | 62.46 |
| BlueLock | 2gb | Xeon X5550 2.67 GHz [1 core] | 2 GB | 49 |
| Amazon EC2 | c1.medium | Xeon E5410 2.33 GHz [2 cores] | 1.7 GB | 39.69 |


Programming Language Performance
Benchmark performance in this category was about 10-15% slower during the loaded benchmark run than what is shown below.

Large Server

| Cloud | Server | CPU | Memory | Score |
|---|---|---|---|---|
| KT | 32GB/6x2GHz | Xeon L5640 2.27 GHz [6 cores] | 32 GB | 123.43 |
| GoGrid | 8gb | Xeon E5450 2.99 GHz [6 cores] | 8 GB | 122.22 |
| Amazon EC2 | m2.2xlarge | Xeon X5550 2.67 GHz [4 cores] | 34.2 GB | 115.45 |
| BlueLock | 16gb-8cpu | Xeon X5550 2.67 GHz [8 cores] | 16 GB | 115.41 |
| KT | 16GB/3x2GHz | Xeon L5640 2.27 GHz [3 cores] | 16 GB | 108.45 |
| Terremark | 16gb-8vpu | AMD 8389 2.91 GHz [8 cores] | 16 GB | 106.9 |
| Amazon EC2 | m2.xlarge | Xeon X5550 2.67 GHz [2 cores] | 17.1 GB | 102.27 |
| Rackspace | 16gb | AMD 2374 2.20 GHz [4 cores] | 16 GB | 78.66 |

Programming Language Performance
Small Server

| Cloud | Server | CPU | Memory | Score |
|---|---|---|---|---|
| BlueLock | 2gb | Xeon X5550 2.67 GHz [1 core] | 2 GB | 101.31 |
| KT | 2GB/1x2GHz | Xeon L5640 2.27 GHz [1 core] | 2 GB | 95.72 |
| Terremark | 2gb | AMD 8389 2.91 GHz [1 core] | 2 GB | 94.82 |
| GoGrid | 2gb | Xeon E5520 2.27 GHz [2 cores] | 2 GB | 80.82 |
| Rackspace | 2gb | AMD 2374 2.20 GHz [4 cores] | 2 GB | 73.71 |


Memory IO Performance
There was no notable difference in memory IO benchmark performance between the loaded and unloaded runs.

Large Server

| Cloud | Server | CPU | Memory | MIOP |
|---|---|---|---|---|
| BlueLock | 16gb-8cpu | Xeon X5550 2.67 GHz [8 cores] | 16 GB | 117.88 |
| KT | 32gb | Xeon L5640 2.27 GHz [6 cores] | 32 GB | 114.48 |
| Amazon EC2 | m2.2xlarge | Xeon X5550 2.67 GHz [4 cores] | 34.2 GB | 113.04 |
| KT | 16gb | Xeon L5640 2.27 GHz [3 cores] | 16 GB | 108.55 |
| Amazon EC2 | m2.xlarge | Xeon X5550 2.67 GHz [2 cores] | 17.1 GB | 102.18 |
| GoGrid | 8gb | Xeon E5450 2.99 GHz [6 cores] | 8 GB | 88.25 |
| Rackspace | 16gb | AMD 2374 2.20 GHz [4 cores] | 16 GB | 70.09 |
| Terremark | 16gb-8vpu | AMD 8389 2.91 GHz [8 cores] | 16 GB | 64.74 |


Memory IO Performance
Small Server

| Cloud | Server | CPU | Memory | MIOP |
|---|---|---|---|---|
| BlueLock | 2gb | Xeon X5550 2.67 GHz [1 core] | 2 GB | 103.73 |
| KT | 2gb | Xeon L5640 2.27 GHz [1 core] | 2 GB | 99.29 |
| GoGrid | 2gb | Xeon E5520 2.27 GHz [2 cores] | 2 GB | 83.74 |
| Terremark | 2gb | AMD 8389 2.91 GHz [1 core] | 2 GB | 66.06 |
| Rackspace | 2gb | AMD 2374 2.20 GHz [4 cores] | 2 GB | 63.04 |


Encoding & Encryption Performance
Benchmark performance in this category was about 5-10% slower during the loaded benchmark run than what is shown below.

Large Server

| Cloud | Server | CPU | Memory | Score |
|---|---|---|---|---|
| GoGrid | 8gb | Xeon E5450 2.99 GHz [6 cores] | 8 GB | 146.51 |
| KT | 16gb | Xeon L5640 2.27 GHz [3 cores] | 16 GB | 139.25 |
| KT | 32gb | Xeon L5640 2.27 GHz [6 cores] | 32 GB | 139.02 |
| Amazon EC2 | m2.2xlarge | Xeon X5550 2.67 GHz [4 cores] | 34.2 GB | 136.32 |
| Amazon EC2 | m2.xlarge | Xeon X5550 2.67 GHz [2 cores] | 17.1 GB | 135.81 |
| BlueLock | 16gb-8cpu | Xeon X5550 2.67 GHz [8 cores] | 16 GB | 130.11 |
| Rackspace | 16gb | AMD 2374 2.20 GHz [4 cores] | 16 GB | 111.2 |
| Terremark | 16gb-8vpu | AMD 8389 2.91 GHz [8 cores] | 16 GB | 95.25 |


Encoding & Encryption Performance
Small Server

| Cloud | Server | CPU | Memory | Score |
|---|---|---|---|---|
| KT | 2gb | Xeon L5640 2.27 GHz [1 core] | 2 GB | 137.21 |
| Terremark | 2gb | AMD 8389 2.91 GHz [1 core] | 2 GB | 131.27 |
| BlueLock | 2gb | Xeon X5550 2.67 GHz [1 core] | 2 GB | 119.57 |
| Rackspace | 2gb | AMD 2374 2.20 GHz [4 cores] | 2 GB | 108.98 |
| GoGrid | 2gb | Xeon E5520 2.27 GHz [2 cores] | 2 GB | 103.78 |
| Amazon EC2 | c1.medium | Xeon E5410 2.33 GHz [2 cores] | 1.7 GB | 101.56 |

Conclusion
Overall, the KT cloud performed very well relative to the public IaaS clouds we compared it with. In particular, disk IO performance was exceptional considering Cloudscaling's use of external storage. By using external storage instead of local storage, the KT cloud offers higher fault tolerance because VMs can be quickly or even automatically migrated to another host should the host they are running on fail. This feature is often referred to as high availability. Use of Intel Westmere L5640 processors also helped to provide very good CPU and memory IO performance. VM sizing also showed a good linear performance increase from smaller to larger instances.
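As a rough check on the VM sizing observation, the KT CCU scores from the CPU Performance tables above can be divided by the number of vCPUs in each instance. The short Python sketch below does that arithmetic; the `scaling_efficiency` helper is our own illustration, and the comparison is approximate because the three instance sizes were benchmarked as separate servers.

```python
# KT CCU scores from the CPU Performance tables above, keyed by vCPU count.
kt_ccu = {1: 5.98, 3: 15.27, 6: 28.66}


def scaling_efficiency(scores):
    """Per-core score of each size relative to the smallest instance."""
    base_cores = min(scores)
    base_per_core = scores[base_cores] / base_cores
    return {
        cores: (ccu / cores) / base_per_core
        for cores, ccu in scores.items()
    }


efficiency = scaling_efficiency(kt_ccu)
for cores in sorted(efficiency):
    per_core = kt_ccu[cores] / cores
    print(f"{cores} vCPU: {per_core:.2f} CCU per core, "
          f"{efficiency[cores]:.0%} of the 1-core rate")
```

By this measure the 3-core and 6-core instances deliver roughly 85% and 80% of the per-core CCU of the 1-core instance, so the increase is close to linear but not perfectly so.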

3 comments:

  1. In Aug 2010, GoGrid began offering a 16GB cloud server. When we performed our GoGrid benchmarks this option was not available (hence use of the 8GB GoGrid cloud server for comparison). Additionally, GoGrid is in the process of upgrading to Westmere processors. The GoGrid results here are based on the previous (mostly Nehalem X5520) cloud servers.

  2. Great objective report! We're very proud that NexentaStor is the storage portion of this larger solution developed by KT and Cloudscaling.

  3. Congratulations to Cloudscaling and Nexenta on the excellent results.
