Microsoft/Danger/T-Mobile to millions of Sidekick users: Whoops.
Short version: Microsoft (who now owns Danger, the makers of the Sidekick) decided to migrate data from one storage network to another. That migration failed, and corrupted the data. Okay, annoying, so restore from the backup, right?
Wrong. No backups. None. Zero. El zilcho.
So millions of Sidekick users awake this past weekend to find that all of their data are gone -- or, in the best scenario, the only data they have are the most recent stuff on the Sidekick itself, and if they let the device power down, they'll lose that, too.
You can't say I didn't warn you.
January 19, 2009 - "Dark Clouds":
Here's where we get to the heart of the problem. Centralization is the core of the cloud computing model, meaning that anything that takes down the centralized service -- network failures, massive malware hit, denial-of-service attack, and so forth -- affects everyone who uses that service. When the documents and the tools both live in the cloud, there's no way for someone to continue working in this failure state. If users don't have their own personal backups (and alternative apps), they're stuck.
Similarly, if a bug affects the cloud application, everyone who uses that application is hurt by it. [...]
In short, the cloud computing model envisioned by many tech pundits (and tech companies) is a wonderful system when it works, and a nightmare when it fails. And the more people who come to depend upon it, the bigger the nightmare. For an individual, a crashed laptop and a crashed cloud may be initially indistinguishable, but the former only afflicts one person and one point of access to information. If a cloud system locks up, potentially millions of people lose access.
So what does all of this mean?
My take is that cloud computing, for all of its apparent (and supposed) benefits, stands to lose legitimacy and support (financial and otherwise) when the first big, millions-of-people-affecting, failure hits. Companies that tie themselves too closely to this particular model, as either service providers or customers, could be in real trouble.
And what do we see now? "Microsoft's Danger Sidekick data loss casts dark cloud on cloud computing." "Microsoft's Sidekick data catastrophe." "Cloud Goes Boom, T-Mo Sidekick Users Lose All Data."
Okay, it's easy to blame the failure to make backups for this disaster. But the point of resilience models is that failure happens. A complex system should not be so brittle that a single mistake can destroy it. Here's what I wrote back in January about what a resilient cloud could look like:
Distributed, individual systems would remain the primary tool of interaction with one's information. Data would live both locally and on the cloud, with updates happening in real-time if possible, delayed if necessary, but always invisibly. All cloud content should be in open formats, so that alternative tools can be used as desired or needed. Ideally, a personal system should be able to replicate data to multiple distinct clouds, to avoid monoculture and single-point-of-failure problems. This version of the cloud is less a primary source for computing services, and more a fail-safe repository. If my personal system fails, all of my data remains available and accessible via the cloud; if the cloud fails, all of my data remains available and accessible via my personal system.
It may not be as sexy as everything-on-the-cloud models, and undoubtedly not as profitable, but a failure like this past weekend's Microsoft/Danger fiasco -- or the myriad cloud failures yet to happen (and they will happen) -- simply wouldn't have been possible.