The standard deviation is simply a measure of the spread of a distribution. There's nothing inherently right or wrong about a high standard deviation; it just means you should expect highly variable performance.
Look at the figures: Slicehost's performance follows a sawtooth-like pattern. The standard deviation is useful precisely because it quantifies what to expect. For roughly normal data, about 2/3 of observations fall within one standard deviation of the mean.
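To see where that ~2/3 comes from, here's a quick sanity check in Python. It assumes normally distributed data, which real benchmark numbers may only approximate, and the mean/SD values below are made up:

    import random
    import statistics

    # Simulate hypothetical benchmark scores; real data may not be
    # normal, in which case the ~2/3 figure only holds approximately.
    random.seed(0)
    samples = [random.gauss(100, 15) for _ in range(100_000)]

    mu = statistics.mean(samples)
    sd = statistics.stdev(samples)

    # Fraction of samples within one standard deviation of the mean.
    within = sum(1 for x in samples if mu - sd <= x <= mu + sd) / len(samples)
    print(f"within +/- 1 SD: {within:.3f}")  # ~0.683 for normal data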
If you think about the problem a little, what you might really worry about is the standard deviation of the standard deviation. That would indeed be a useful quantity, but it's hard to measure.
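It's not impossible to estimate, though. One approach (my suggestion, not something the article does) is the bootstrap: resample the runs with replacement and see how much the SD itself jumps around. A sketch with invented timings:

    import random
    import statistics

    random.seed(1)
    # Hypothetical benchmark timings in seconds (invented numbers).
    runs = [random.gauss(2.0, 0.5) for _ in range(50)]

    # Bootstrap: resample with replacement, recompute the SD each time,
    # and take the spread of those SDs as the SD's own uncertainty.
    boot_sds = [
        statistics.stdev(random.choices(runs, k=len(runs)))
        for _ in range(10_000)
    ]

    print(f"SD estimate: {statistics.stdev(runs):.3f}")
    print(f"bootstrap SE of the SD: {statistics.stdev(boot_sds):.3f}")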
EDIT below this line -------
Several comments below suggest that the SD is somehow less useful if it's "large" (or large relative to the mean, or whatever). People think large SDs indicate a poor experiment because in school lab classes one calculates the SD and calls it the "error".
The standard deviation is a measure of spread; if it's large, then the spread is large. Knowing the spread has value. In this case, under the parent's experimental conditions, EC2's performance is more constant than Slicehost's.
A fair critique of the blog post is that the error on the standard deviation itself may be large, depending on the experimental conditions. It is _not_ a fair critique to say that the SD is too high to make a prediction; you just have a larger performance spread. Note that the spread described is not necessarily "error": it is inherent either to the server (as the article implies) or to the method (in which case it is an error).
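To make that concrete, here's a sketch with invented latency numbers (not the article's data): both hosts give usable predictions, one interval is just wider than the other.

    import statistics

    # Invented request latencies (ms) standing in for the article's
    # EC2 and Slicehost measurements -- not real data.
    ec2       = [102, 98, 101, 99, 100, 103, 97, 100, 99, 101]
    slicehost = [60, 140, 65, 135, 70, 130, 62, 138, 68, 132]

    for name, xs in [("EC2", ec2), ("Slicehost", slicehost)]:
        mu, sd = statistics.mean(xs), statistics.stdev(xs)
        # A large SD doesn't invalidate the prediction; it widens it.
        print(f"{name}: expect roughly {mu - sd:.0f}..{mu + sd:.0f} ms "
              f"(mean {mu:.0f}, SD {sd:.0f})")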
I understand the performance variance characteristics; what concerns me about these tests is that the graphs are not continuous. They show immediate dips instead of gradual curves, which suggests the sample size wasn't large enough, or at least that the graph needs higher resolution if the data exists to support it. Also, at the bottom, the article gives the numerical data with the mean and standard deviation. But the means for the hosts with high standard deviations are essentially useless, because the standard deviation is so high. If we are going to compare average performance between cloud hosts, let's at least have useful averages to base our opinions on.
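If you want to put a number on "the standard deviation is so high", one common choice is the coefficient of variation (SD divided by mean), which makes spreads comparable across hosts with different means. A minimal sketch with invented throughput figures:

    import statistics

    # Invented throughput samples (req/s); not the article's data.
    throughput = {
        "host_a": [200, 205, 198, 202, 201],  # tight spread
        "host_b": [120, 310, 90, 280, 200],   # wide spread
    }

    for name, xs in throughput.items():
        mu, sd = statistics.mean(xs), statistics.stdev(xs)
        print(f"{name}: mean={mu:.0f}, SD={sd:.0f}, CV={sd / mu:.2f}")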
I think a low standard deviation / variance is desirable: it provides a level of performance that is predictable and reliable. Then you can simply scale up according to your processing needs.