Friday, June 11, 2010

Google Storage, a CDN/Storage Hybrid?

We received our Google Storage for Developers account today and, being the curious types, immediately ran some network tests. We maintain a network of about 40 cloud servers around the world, which we use to monitor cloud servers, test cloud-to-cloud network throughput, and run our cloud speedtest. We used a handful of these to test Google Storage uplink and downlink throughput and latency. We were very surprised by the low latency and consistently fast downlink throughput to most locations.

Most storage services, like Amazon's S3 and Azure Blob Storage, are physically hosted in a single geographical location. S3, for example, is divided into 4 regions: US West, US East, EU West and APAC. If you store a file in an S3 US West bucket, it is not replicated to the other regions and can only be downloaded from that region's servers. The result is much slower network performance from locations with poor connectivity to that region. Hence, you need to choose your S3 region wisely based on geographical and/or network proximity. Azure's Blob Storage uses a similar approach. Users can add a CDN on top of those services (CloudFront and Azure CDN), but the CDN does not provide the same access control and consistency features as the storage service.

In contrast, Google's new Storage for Developers service appears to store uploaded files on a globally distributed network of servers. When a file is requested, Google uses some DNS magic to direct the user to a server that will provide the fastest access to that file. This is very similar to how Content Delivery Networks like Akamai and Edgecast work, wherein files are distributed to multiple globally placed PoPs (points of presence).
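One way to observe this kind of DNS-based steering is to resolve the same hostname from several vantage points and compare the answers. The following is a minimal sketch, not the tooling we used: the function name `resolved_addresses` is made up for illustration, and the hostname in the usage comment is an assumption that should be replaced with the host part of your own object's URL.

```python
import socket


def resolved_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to.

    `resolver` is injectable so the function can be exercised without
    network access; by default it is the standard library's getaddrinfo.
    """
    infos = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})


# Usage (requires network access; the hostname is an assumption):
#   resolved_addresses("commondatastorage.googleapis.com")
```

Running this on servers in different regions and comparing the returned lists shows whether the same URL is being answered by different front-end addresses.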

Our simple test consisted of requesting a 10 MB test file from 17 different servers located in the US, EU and APAC. The test file was set to public, and we used wget to test downlink throughput and Google's gsutil to test uplink throughput (wget was faster than gsutil for downloads). In doing so, we found the same test URL resolved to 11 different Google servers with an average downlink of about 40 Mb/s! This hybrid model of CDN-like performance with enterprise storage features like durability, consistency and access control represents an exciting leap forward for cloud storage!
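For readers who want to repeat a downlink measurement without wget, the idea can be sketched in a few lines: time how long it takes to read the whole object, then convert bytes per second to megabits per second. This is an illustrative sketch only. The helper names are hypothetical, and it assumes 1 Mb = 1,000,000 bits, which may not match the convention wget uses when it reports speeds.

```python
import time
import urllib.request


def throughput_mbps(num_bytes, seconds):
    """Convert a byte count and elapsed time to megabits per second.

    Assumes decimal units: 1 Mb = 1,000,000 bits.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes * 8 / seconds / 1_000_000


def timed_download(url, opener=urllib.request.urlopen, chunk_size=64 * 1024):
    """Read the whole response body and return (bytes_read, elapsed_seconds).

    `opener` is injectable so the function can be tested offline.
    The timer includes connection setup as well as the transfer itself.
    """
    start = time.perf_counter()
    total = 0
    with opener(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


# Usage (requires network access; the URL is a placeholder):
#   size, secs = timed_download("http://example.com/test-10mb.bin")
#   print(round(throughput_mbps(size, secs), 1), "Mb/s")
```

As a sanity check on the units, a 10,000,000-byte file fetched in 2 seconds works out to 40 Mb/s.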

Google Storage Network Performance Tests

| Service | Location | Resolved To | Latency | Upload (Mb/s) | Download (Mb/s) |
|---|---|---|---|---|---|
| Gigenet | IL, US | 209.85.225.132 | 15.05 | 15.13 | 68.4 |
| CloudSigma | Switzerland | 209.85.227.132 | 24.25 | 1.75 | 21.5 |
| Linode (London) | UK | 209.85.229.132 | 7.13 | 5.72 | 41.6 |
| Bluelock | IN, US | 64.233.169.132 | 23.1 | 4.42 | 56.8 |
| EC2 (EU West) | Ireland | 66.102.9.132 | 2.19 | 6.58 | 29.5 |
| CloudCentral | Australia | 66.102.11.132 | 5.96 | 4.84 | 8.2 |
| Rimu Hosting | New Zealand | 66.102.11.132 | 27.05 | 5.53 | 9.64 |
| Gandi Cloud VPS | France | 66.102.13.132 | 19.9 | 2.81 | 31.1 |
| Zerigo | CO, US | 72.14.203.132 | 57.05 | 5.63 | 24.04 |
| Voxel (Amsterdam) | The Netherlands | 72.14.204.132 | 83.2 | 3.86 | 40.6 |
| EC2 (US East) | VA, US | 72.14.204.132 | 2.72 | 10.14 | 39.4 |
| EC2 (US West) | CA, US | 72.14.204.132 | 27.7 | 8.77 | 9.5 |
| Speedyrails | Canada | 72.14.204.132 | 24.5 | 8.28 | 33.1 |
| VPS.net (UK) | UK | 74.125.79.132 | 11.6 | 3.48 | 32.1 |
| Rackspace (Chicago) | IL, US | 74.125.95.132 | 13.05 | 4.23 | 70.3 |
| GoGrid | CA, US | 74.125.127.132 | 24.2 | 6.32 | 39.9 |
| Zerigo | CO, US | 74.125.155.132 | 55.3 | 10.51 | 42.08 |
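The results can be summarized with a short script. The sketch below copies the resolved address and download figure from each row of the table and counts distinct addresses and the mean downlink. The variable names are illustrative only. Tallied this way, the rows as printed give 13 distinct addresses and a mean download of roughly 35 Mb/s, so the figures quoted in the text above may reflect a slightly different set of runs.

```python
from statistics import mean

# (service, resolved address, download Mb/s), copied from the table rows.
results = [
    ("Gigenet", "209.85.225.132", 68.4),
    ("CloudSigma", "209.85.227.132", 21.5),
    ("Linode (London)", "209.85.229.132", 41.6),
    ("Bluelock", "64.233.169.132", 56.8),
    ("EC2 (EU West)", "66.102.9.132", 29.5),
    ("CloudCentral", "66.102.11.132", 8.2),
    ("Rimu Hosting", "66.102.11.132", 9.64),
    ("Gandi Cloud VPS", "66.102.13.132", 31.1),
    ("Zerigo", "72.14.203.132", 24.04),
    ("Voxel (Amsterdam)", "72.14.204.132", 40.6),
    ("EC2 (US East)", "72.14.204.132", 39.4),
    ("EC2 (US West)", "72.14.204.132", 9.5),
    ("Speedyrails", "72.14.204.132", 33.1),
    ("VPS.net (UK)", "74.125.79.132", 32.1),
    ("Rackspace (Chicago)", "74.125.95.132", 70.3),
    ("GoGrid", "74.125.127.132", 39.9),
    ("Zerigo", "74.125.155.132", 42.08),
]

# Distinct Google front-end addresses seen across all test locations.
distinct_ips = {ip for _, ip, _ in results}

# Unweighted mean of the per-location download throughput.
mean_download = mean(download for _, _, download in results)

print(len(results), "tests,", len(distinct_ips), "distinct addresses")
print("mean download:", round(mean_download, 1), "Mb/s")
```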

3 comments:

  1. I always enjoy learning how other people employ Google Storage. I am wondering if you could check out my very own tool, CloudBerry Explorer, which helps manage Google Storage on Windows. It is freeware.

  2. I would love to use Google Storage as a CDN for my WordPress sites but don't know how to do this. If anyone is willing to help on the Google side, I can make a WordPress plugin.

  3. Like you said, yeah, Google uses some kind of DNS-magic that almost always hides the actual IP address of the server delivering the object(s) and instead shows their Mountain View, California address.

    It would be really interesting to know how they do that though. I would be all eyes for it. =)
