
What about hardware failure? On AWS you just commission a new instance and your downtime is minutes rather than hours, plus you don't have to keep extra hardware on hand just to avoid days of downtime. There are also smaller, more localized issues like network switch failures and other things that you probably never even notice on Amazon, but that might be more likely to bite you on a dedicated host.

If an AWS data center goes down it gets a lot of press, but does it actually outweigh the sum of all dedicated/shared/VPS hosting issues for an equivalent volume of hosting?



There are some nice middle options out there. I'll use Softlayer as an example as I have provisioned a lot of machinery over there.

I can order machines online and SSH in 3-4 hours later. Even exotic stuff they turn around just as fast: we saw that speed on a quad octo-core box with a RAID 10 array of Intel SSDs.

That's real metal too, with real IO (most of my work is IO bound, so VMs and the cloud are not options). You get to pick the exact CPUs, disks, etc., and they slot them into solid Supermicro boards and use good Adaptec disk controllers. You pay monthly and can spin down the box at any time (though you must pay for full months; there's no per-minute pricing like AWS).

That is the dedicated hardware side. You can also spin up compute instances, and those can be cloned and fired up in bulk, but they have the same IO problems that all other VMs have.

In any case, I just wanted to mention that they are a decent middle ground. They're not as automated and polished as Amazon on the VM side, but you can spin up mixtures of metal and VMs to get combinations that make sense: pushing compute or RAM-only stuff to VMs and keeping DBs and persistence layers on real metal. They have a few different datacenters too, so you can spread gear around physical locations.


I'm fairly sure that my downtime due to a hardware failure at Softlayer would be less than the downtime AWS has had for huge numbers of people this year. And hardware failures on a given server happen less often than once a year on average.

Problems are just not as common if you're running on a handful of dedicated machines, and a single dedicated machine at a good host can handle a LOT without all the crazy reliability engineering that running on AWS requires. You need backups, but you don't need to assume that you must be able to fail over instantly or else face guaranteed downtime sometime soon. I don't think that difference can be overstated, since it lets you focus on more important things.


Speaking of Softlayer specifically, they've diagnosed and then replaced failed hardware for me (hard disks and power supplies so far) within 15-30 minutes of my opening a support ticket. One of the incidents was around 2 AM local time at the server's location, and their response time was the same.
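
For anyone who wants to put rough numbers on this, here is a back-of-the-envelope sketch in Python. It only uses figures quoted in this thread (at most about one hardware failure per server per year, 15-30 minutes for an in-place repair, 3-4 hours to get a freshly provisioned box) and treats them as upper bounds. The function names are made up for illustration, and the model assumes a single server with no failover, so take it as a rough estimate rather than a measurement.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def expected_downtime_hours(failures_per_year, hours_per_failure):
    """Expected hours of downtime per year for one server with no failover."""
    return failures_per_year * hours_per_failure


def availability(downtime_hours):
    """Fraction of the year the server is up, given yearly downtime in hours."""
    return 1 - downtime_hours / HOURS_PER_YEAR


# Assumption: one failure per year, repaired in place in 30 minutes.
repair_case = expected_downtime_hours(1, 0.5)

# Assumption: one failure per year, fixed by waiting 4 hours for a new box.
reprovision_case = expected_downtime_hours(1, 4)

if __name__ == "__main__":
    for label, hours in [("in-place repair", repair_case),
                         ("reprovision", reprovision_case)]:
        print(f"{label}: {hours} h/year down, "
              f"availability {availability(hours):.5f}")
```

Even the pessimistic case (a full reprovision once a year) works out to about four hours of downtime a year, which is the kind of number you would want to compare against whatever outage hours you actually saw on AWS.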



