Saturday, January 9, 2010

Cloud Monitoring

Over the past few months we've been monitoring uptime and bandwidth for various cloud services, including CDNs (e.g. Akamai, Edgecast), Platforms (e.g. AppEngine, Azure), Servers (e.g. EC2, GoGrid), and Storage (e.g. S3, Nirvanix). To do so, we created accounts with these cloud providers and set up files and servers to be monitored on each. We then used both Pingdom and Panopta to monitor them. Pingdom maintains 15 globally placed servers, which we use to pull a 5MB test file every 15 minutes in order to measure downlink throughput. We use both Panopta and Pingdom to monitor uptime. Panopta does a better job of this than Pingdom because it verifies each outage with 3 different monitors before reporting it, which has resulted in fewer false positives than we experience with Pingdom. We also manually verify most downtime periods with the service provider and exclude maintenance periods. The results of these ongoing tests are posted here:

Uptime % for each cloud service:

Pingdom monitoring public report (based on a 5MB test file):
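
For readers who want to reproduce the arithmetic behind these two reports, here is a minimal sketch of how the numbers could be derived. It is not the code Pingdom or Panopta run. The function names are hypothetical, and it assumes 5MB means 5,000,000 bytes, that maintenance windows are removed from the measured period before computing uptime, and that an outage counts only once 3 monitors agree.

```python
def throughput_mbps(size_bytes: int, seconds: float) -> float:
    """Average downlink throughput in megabits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_bytes * 8 / seconds / 1_000_000


def uptime_percent(period_s: float, downtime_s: float,
                   maintenance_s: float = 0.0) -> float:
    """Uptime %, with maintenance removed from the measured period (assumption)."""
    measured = period_s - maintenance_s
    if measured <= 0:
        raise ValueError("measured period must be positive")
    return 100.0 * (measured - downtime_s) / measured


def outage_confirmed(reports_down: list, required: int = 3) -> bool:
    """True once at least `required` monitors report the target as down."""
    return sum(1 for down in reports_down if down) >= required


if __name__ == "__main__":
    test_file_bytes = 5_000_000  # assumed size of the 5MB test file
    print(throughput_mbps(test_file_bytes, 4.0))   # a 4 second download
    print(uptime_percent(30 * 24 * 3600, 2592))    # about 99.9 for a 30 day month
    print(outage_confirmed([True, True, False]))   # only 2 of 3 monitors agree
```

With a test every 15 minutes, each monitored file yields 96 throughput samples per day, so the averages in the public report smooth over individual slow downloads.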
