January 27, 2012

The Legal Ramifications of Cloud Outages

Here's a public service announcement for cloud customers and cloud service providers alike: If you're not doing something to significantly increase the reliability of your cloud systems, you should prepare your legal team.

Take a look at this article; it's a great primer to get everyone started: 5 Key Considerations When Litigating Cloud Computing Disputes by Gerry Silver, partner at Chadbourne & Parke.

I agree with Silver, who sums up the situation nicely when he says that "given the ever-increasing reliance on cloud computing, it is inevitable that disputes and litigation will increase between corporations and cloud service providers."

Understandably, both cloud users and cloud providers will want to dodge responsibility for cloud outages. "The corporation may be facing enormous liability and will seek to hold the cloud provider responsible, while the cloud provider will undoubtedly look to the parties' agreement and the underlying circumstances for defenses" [source].

Looks like the future is bright for attorneys who specialize in cloud issues. After all, a faulty power supply or software glitch could lead to years of court battles.

Five Legal Elements

Silver outlines five key elements for the legal team to consider:
  • Limitation of liability written into service contracts.
  • Whether the Limitation of Liability clause can be circumvented: can the cloud provider be held responsible despite this clause?
  • Contract terms: A breach of contract on either side can greatly affect litigation outcomes.
  • Remedies: During the crisis, the corporation could demand that the cloud provider take extraordinary steps to restore systems and data.
  • Insurance and indemnification: Insurance may cover some losses, and a third party may bear some responsibility for the problem too.

The Disturbing News: Expectations are Low

In my travels, I am still surprised at how little thought goes into the liability associated with an outage, whether it happens in a data center, a cloud, or a hybrid configuration. Although I embrace everyone’s motivation to move to the cloud, I found a couple of points in Silver's article disturbing because they shed light on the obsolete way the tech industry thinks about cloud architecture as it relates to disaster prevention.

1) Just how much foresight is a cloud provider legally expected to have? In the section titled "May the Limitation of Liability Clause Be Circumvented?" Silver describes how "one court recently sustained a claim of gross negligence and/or recklessness in a cloud computing/loss of data case because it was alleged that the provider failed to take adequate steps to protect the data." This raises the question of what constitutes "failure to take adequate steps". Does it mean that the provider did something genuinely negligent like setting up a system with multiple single points of failure? Were they culpable because they had followed best practices and relied upon an industry-standard failover-based recovery system which later failed? Or did they fail to seek out (or create) the most advanced and reliable proactive business continuity system on the planet? Whatever they were using probably seemed good at the time but was clearly not adequate because it failed to protect the customer's data.

I would speculate that a customer’s lawyer would have a pretty high expectation of what "adequate steps" are, but as you will see in my next point, the bar is still set pretty low.

2) The expectation is that cloud providers will be using failover, which is 20 years out of date. In the same section, Silver asks "Were back-ups of data stored in different regions? Were banks of computers isolated from one another ready to take over if another zone failed?" This clearly describes a failover system. Apparently his expectation is that a cloud provider should follow current best practices and use a failover disaster recovery system. But the failover technique was designed decades ago for systems that are now extinct or nearly so. The latest networks are radically more sophisticated than their forebears and consequently have radically different requirements. Even a successful failover is a perilous thing, and failovers fail all the time. If they didn't, Mr. Silver would probably not have found it necessary to write this article. Backups happen only on fixed schedules, so the most recent transactions are often lost during a disaster. You can expect legal battles over downtime and data loss to continue because cloud providers and their customers are all using one variation or another of these outdated disaster recovery techniques. So how can a disaster recovery system that is so prone to disaster be considered an "adequate step"?
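To make the point about fixed backup schedules concrete, here is a rough sketch in Python. It assumes a simplified model that is not taken from Silver's article: backups complete instantly at regular intervals starting at hour 0, and everything written since the last backup is lost when the system fails. The function name and the numbers are hypothetical.

```python
def lost_window_hours(backup_interval_hours: float, failure_time_hours: float) -> float:
    """Hours of transactions recorded since the most recent scheduled backup.

    Assumption: backups complete instantly at t = 0, interval, 2 * interval, ...
    """
    if backup_interval_hours <= 0:
        raise ValueError("backup interval must be positive")
    if failure_time_hours < 0:
        raise ValueError("failure time must not be negative")
    # Time elapsed since the last multiple of the backup interval.
    return failure_time_hours % backup_interval_hours


# Hypothetical example: nightly backups, failure 47 hours after the first one.
print(lost_window_hours(24, 47))  # 23 hours of transactions have no backup
```

Under this model, shortening the interval shrinks the window but never closes it, which is why recent transactions remain exposed.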

Like I said, you'd better call a meeting with your legal counsel and get ready.

No Outage, No Litigation

ZeroNines can actually eliminate outages. Our Always Available™ technology processes all network transactions simultaneously and in parallel on multiple cloud nodes or servers that are geographically separated. If something fails and brings down Cloud Node A, Nodes B and C continue processing everything as if nothing had happened. There is no hierarchy and no failover. So if the cloud provider's service does not go offline, there is no violation of SLAs and no cause for litigation.
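The general idea of processing every transaction on several peer nodes can be illustrated with a toy Python sketch. This is not the Always Available implementation, and every name in it (`Node`, `submit`, the order IDs) is hypothetical. For simplicity it calls the nodes one after another rather than truly in parallel, and it ignores how a recovered node would catch up.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical processing node with its own copy of the transaction log."""
    name: str
    up: bool = True
    ledger: list = field(default_factory=list)

    def process(self, transaction: str) -> bool:
        # A node that is down processes nothing and reports failure.
        if not self.up:
            return False
        self.ledger.append(transaction)
        return True


def submit(nodes, transaction):
    """Send the transaction to every node; succeed if at least one processed it."""
    # The list is built in full so that every node is attempted (no short-circuit).
    results = [node.process(transaction) for node in nodes]
    return any(results)


nodes = [Node("A"), Node("B"), Node("C")]
submit(nodes, "order-1")
nodes[0].up = False            # Node A fails
ok = submit(nodes, "order-2")  # Nodes B and C keep processing
print(ok, {n.name: n.ledger for n in nodes})
```

There is no primary node and no switchover step in this sketch: the service only counts as down if every node is down at once.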

Our approach to business continuity is far superior to the failover paradigm, offering in excess of five nines (>99.999%) of uptime. It is suitable for modern generations of clouds, virtual servers, traditional servers, colocation hosting, in-house servers, and the applications and databases that clients will want to run in all of these.
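For readers who want to see what "five nines" means in practice, the arithmetic can be sketched in a few lines of Python. The only assumption is an average year of 365.25 days; the function name is just for illustration.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumes an average year of 365.25 days


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum yearly downtime, in minutes, permitted by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
    print(f"{label} ({pct}%): {downtime_minutes_per_year(pct):.2f} minutes/year")
```

At 99.999% the allowance works out to a little over five minutes of downtime per year, compared with nearly nine hours at 99.9%.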

So my message to cloud providers is to check out ZeroNines and Always Available as a means of protecting your service from downtime and the litigation that can come with it.

My message to cloud customers is that you can apply ZeroNines and Always Available whether your cloud provider is involved or not. After all, your key interest here is to maintain business continuity, not to win a big settlement over an outage.

And heads-up to the lawyers on both sides: We are setting a new standard in what constitutes "adequate steps".

Visit the ZeroNines website to find out more about how our disaster-proof architecture protects businesses of any description from downtime.

Alan Gin – Founder & CEO, ZeroNines

January 23, 2012

RIM co-CEOs Resign: Is This the Cost of Downtime?

Back in October I commented in this blog about the enormous RIM BlackBerry outage [source]. I wrote that "even a massive outage like this is unlikely to cause the demise of a large and important firm, but combined with other woes like a less-than-competitive product and poor business model it could well be the deciding factor."

And now for the fallout. RIM is still in business, but its beleaguered co-CEOs/co-Chairmen Jim Balsillie and Mike Lazaridis have resigned and taken other positions within the company [source]. I'm sure it was not the outage alone (or all RIM outages put together) that caused this leadership shakeup. But it could well have been the deciding factor.

Outages and CEO Job Security

RIM's product problems are certainly serious. But I see a fundamental difference between 1) the prescience needed to get the right product to market at the right time, and 2) the technical ability to keep an existing product up and running. Customers might to some degree forgive a company whose product is reliable but behind the times. They will abandon a product that doesn't work when they need it, even if it is the newest, slickest thing around.

The October outage has RIM "facing a possible class action lawsuit in Canada" [source]. Add that cost to the costs of recovery, customer abandonment, lost shareholder value, and so forth. (Stay tuned; I will be commenting on the legal issues around cloud outages in the next few days.)

To put RIM's decline in perspective, the company was worth $70 billion a few years ago but today has a market value of about $8.9 billion [source]. Its stock dropped about 75% last year and was down to $16.28 before the market opened on Monday, January 23, 2012 [source].

So according to the rules of modern business, someone has to pay, and in this case it is the CEOs.

Now Imagine This at a Smaller Company

Can you imagine a three-day outage at a smaller software company? Or even a one-day outage? Imagine a typical e-commerce technology provider with 50 retail customers, 100 employees, and a SaaS application. If the core application, image server, database server, customer care system, inventory system, orders & fulfillment system, or other key element goes down, that could be the end of them. Many smaller companies do not survive a significant downtime event. And many smaller retailers do not survive if they are unable to do business on a key shopping day such as Black Friday or Cyber Monday.

Or imagine that the email system goes down for just a couple of hours. It happens all the time. Email is a key element of workflow and productivity, and what company can afford to sit still for even a couple of hours?

It's more than the CEO whose job is at risk. Here's where an ounce of prevention is worth far more than a pound of cure.

That Rickety Old Failover

Remember my earlier comment about outdated yet reliable products, versus outdated and unreliable products? Ironically, the failover disaster recovery model that failed RIM back in October is one of those old and unreliable products. It was designed for systems and architectures that no longer bear any resemblance to what businesses are actually using. If failover worked, I would not be writing this because there would be no need for its replacement.

But if you want to find out about real business continuity and getting away from failover, take a look at ZeroNines. Our Always Available™ architecture processes in multiple cloud locations, on multiple servers, and in multiple nodes. There is no hierarchy, so if one node goes down, the others continue processing all network transactions. ZeroNines can bring application uptime to virtually 100%. It is a complete departure from the failover that RIM is using, and that small businesses everywhere stake their futures upon.

Time Will Tell

"RIM earned its reputation by focusing relentlessly on the customer and delivering unique mobile communications solutions… We intend to build on this heritage to expand BlackBerry's leadership position," RIM's new CEO Thorsten Heins is quoted as saying [source].

Let's hope this "focus on the customer" also includes a strategic initiative to build genuine uptime and availability, or maybe we'll be reading about another new RIM CEO next January.

Alan Gin – Founder & CEO, ZeroNines