Monday, April 25, 2011

How SmugMug survived the Amazon Outage

SmugMug, a photo-sharing site, says four simple things allowed it to survive the recent Amazon Web Services outage, despite its reliance on Amazon's cloud computing infrastructure. Geographical dispersion is the first principle: SmugMug uses "multiple Availability Zones," so when there is a problem at one of its three sites, service continues from the others. The company also "designed for failure," assuming there would be a major outage at some point and building in backup systems and components to handle it.
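As a rough illustration of what continuing service from other sites can look like in code, here is a minimal sketch of client-side failover across zones. It is not SmugMug's implementation; the zone names, the `fetch_with_failover` helper, and the `demo_fetch` backend are all hypothetical, and a real system would add health checks, timeouts, and retries.

```python
from typing import Callable, List, Sequence


class AllZonesFailed(Exception):
    """Raised when every zone in the list has failed."""


def fetch_with_failover(zones: Sequence[str], fetch: Callable[[str], str]) -> str:
    """Try each zone in order and return the first successful response."""
    errors: List[str] = []
    for zone in zones:
        try:
            return fetch(zone)
        except ConnectionError as exc:
            # Record the failure and move on to the next zone.
            errors.append(f"{zone}: {exc}")
    raise AllZonesFailed("; ".join(errors))


def demo_fetch(zone: str) -> str:
    # Hypothetical backend: pretend the first zone is down.
    if zone == "zone-a":
        raise ConnectionError("zone unavailable")
    return f"photo served from {zone}"


result = fetch_with_failover(["zone-a", "zone-b", "zone-c"], demo_fetch)
print(result)
```

With the first zone simulated as down, the request is answered by the second zone. If every zone fails, the caller gets a single `AllZonesFailed` error listing what went wrong in each one.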

SmugMug does not use Elastic Block Store (EBS), the Amazon storage service that failed during the recent outage. Nor does SmugMug rely completely on cloud computing. "The exact types of data that would have potentially been disabled by the EBS meltdown don’t actually live at AWS at all; it all still lives in our own datacenters," says company CEO Don MacAskill.

The advice is obvious. An organization using cloud computing facilities needs the same level of redundancy it would need when using facilities on its own premises or in a data center it owns.
