<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-2171310898568852604</id><updated>2012-01-27T10:38:25.337-08:00</updated><category term='Twitter'/><category term='Microsoft'/><category term='Netflix'/><category term='Amazon'/><category term='Emerson'/><category term='retail'/><category term='PayPal'/><category term='privacy'/><category term='Apple'/><category term='business continuity'/><category term='ITIC'/><category term='lawyer'/><category term='financial'/><category term='ZeroNines'/><category term='Heins'/><category term='medical'/><category term='FAA airline Always Available'/><category term='PHR'/><category term='Reddit'/><category term='uptime'/><category term='Dell'/><category term='outage'/><category term='Technology Review'/><category term='continuity'/><category term='Always Available'/><category term='attorney'/><category term='email'/><category term='lawsuit'/><category term='disaster recovery'/><category term='Sidekick'/><category term='EC2'/><category term='BAE'/><category term='Balsillie'/><category term='Office 365'/><category term='Foursquare'/><category term='Victoria&apos;s Secret'/><category term='EMR'/><category term='judgement'/><category term='Stratus Technologies'/><category term='downtime'/><category term='cloud storage'/><category term='Cyber Monday'/><category term='Cloud Computing'/><category term='security'/><category term='Gmail'/><category term='FAA airline'/><category term='Instagram'/><category term='cloud'/><category term='litigation'/><category term='Google'/><category term='failover'/><category term='blackberry'/><category term='Lazaridis'/><category term='Black Friday'/><category 
term='failsafe'/><category term='TD AMERITRADE'/><category term='Haiti'/><category term='uptime survey'/><category term='health'/><category term='UPS'/><category term='Ireland'/><category term='T-Mobile'/><category term='RIM'/><title type='text'>ZeroNines® - Always Available™</title><subtitle type='html'>A look at news-making network application downtime events, service provider disasters, infrastructure failures, and cloud computing outages that could have been prevented. Most SaaS and web software and storage providers rely on outmoded disaster recovery methods like failover and backup that offer no real protection for critical applications and data. ZeroNines offers uptime in excess of five nines for true disaster prevention among virtualized environments, enterprise networks, and the cloud.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>21</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-3832494120515476771</id><published>2012-01-27T09:35:00.000-08:00</published><updated>2012-01-27T09:35:47.277-08:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='judgement'/><category scheme='http://www.blogger.com/atom/ns#' term='litigation'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='lawyer'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='failover'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud'/><category scheme='http://www.blogger.com/atom/ns#' term='lawsuit'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='business continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='attorney'/><title type='text'>The Legal Ramifications of Cloud Outages</title><content type='html'>Here's a public service announcement for cloud customers and cloud service providers alike: If you're not doing something to significantly increase the reliability of your cloud systems, you should prepare your legal team. &lt;br /&gt;&lt;br /&gt;Take a look at this article; it's a great primer to get everyone started: &lt;a href="http://www.law.com/jsp/article.jsp?id=1202538226208&amp;_Key_Considerations_When_Litigating_Cloud_Computing_Disputes&amp;slreturn=1"&gt;5 Key Considerations When Litigating Cloud Computing Disputes&lt;/a&gt; by Gerry Silver, partner at Chadbourne &amp; Parke.&lt;br /&gt;&lt;br /&gt;I agree with Silver who sums up the situation nicely when he says that "given the ever-increasing reliance on cloud computing, it is inevitable that disputes and litigation will increase between corporations and cloud service providers." 
&lt;br /&gt;&lt;br /&gt;Understandably, both cloud users and cloud providers will want to dodge responsibility for cloud outages. "The corporation may be facing enormous liability and will seek to hold the cloud provider responsible, while the cloud provider will undoubtedly look to the parties' agreement and the underlying circumstances for defenses" [&lt;a href="http://www.law.com/jsp/article.jsp?id=1202538226208&amp;_Key_Considerations_When_Litigating_Cloud_Computing_Disputes&amp;slreturn=1"&gt;source&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;Looks like the future is bright for attorneys who specialize in cloud issues. After all, a faulty power supply or software glitch could lead to years of court battles.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Five Legal Elements&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Silver outlines five key elements for the legal team to consider:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Limitation of liability written into service contracts.&lt;/li&gt;&lt;li&gt;Whether the Limitation of Liability clause can be circumvented: can the cloud provider be held responsible despite this clause?&lt;/li&gt;&lt;li&gt;Contract terms: A breach of contract on either side can greatly affect litigation outcomes.&lt;/li&gt;&lt;li&gt;Remedies: During the crisis the corporation could demand that the cloud provider takes extraordinary steps to restore systems and data.&lt;/li&gt;&lt;li&gt;Insurance and indemnification: Insurance may cover some losses, and a third party may bear some responsibility for the problem too.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;b&gt;The Disturbing News: Expectations are Low&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In my travels, I am still surprised at how little thought goes into the liability associated with an outage whether it be in a data center, cloud or hybrid configuration. 
Although I embrace everyone’s motivation to move to the cloud,  I found a couple of points in Silver's article disturbing because they shed light on the obsolete way the tech industry thinks about cloud architecture as it relates to disaster prevention. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;1) Just how much foresight is a cloud provider legally expected to have?&lt;/b&gt; In the section titled "May the Limitation of Liability Clause Be Circumvented?" Silver describes how "one court recently sustained a claim of gross negligence and/or recklessness in a cloud computing/loss of data case because it was alleged that the provider failed to take adequate steps to protect the data." This raises the question of what constitutes "failure to take adequate steps". Does it mean that the provider did something genuinely negligent like setting up a system with multiple single points of failure? Were they culpable because they had followed best practices and relied upon an industry-standard failover-based recovery system which later failed? Or did they fail to seek out (or create) the most advanced and reliable proactive business continuity system on the planet? Whatever they were using probably seemed good at the time but was clearly not adequate because it failed to protect the customer's data. &lt;br /&gt;&lt;br /&gt;I would speculate that a customer’s lawyer would have a pretty high expectation of what "adequate steps" are, but as you will see in my next point the bar is still set pretty low.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;2) The expectation is that cloud providers will be using failover, which is 20 years out of date.&lt;/b&gt; In the same section, Silver asks "Were back-ups of data stored in different regions? Were banks of computers isolated from one another ready to take over if another zone failed?" This without doubt describes a failover system. 
Apparently his expectation is that a cloud provider should follow current best practices and use a failover disaster recovery system. But the failover technique was designed decades ago for systems that are now extinct or nearly so. The latest networks are radically more sophisticated than their forebears and consequently have radically different requirements. Even a successful failover is a perilous thing, and failovers fail all the time. If they didn't, Mr. Silver would probably not have found it necessary to write this article. Backups happen only on fixed schedules so the most recent transactions are often lost during a disaster. You can expect legal battles over downtime and data loss to continue because cloud providers and their customers are all using one variation or another of these outdated disaster recovery techniques. So how can a disaster recovery system that is so prone to disaster be considered an "adequate step"? &lt;br /&gt;&lt;br /&gt;Like I said, you'd better call a meeting with your legal counsel and get ready.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;No Outage, No Litigation&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;ZeroNines can actually eliminate outages. Our Always Available™ technology processes all network transactions simultaneously and in parallel on multiple cloud nodes or servers that are geographically separated. If something fails and brings down Cloud Node A, Nodes B and C continue processing everything as if nothing had happened. There is no hierarchy and no failover. So if this cloud provider's service does not go offline there is no violation of SLAs and no cause for litigation.&lt;br /&gt;&lt;br /&gt;Our approach to business continuity is far superior to the failover paradigm, offering in excess of five nines (&amp;#062;99.999%) of uptime. 
It is suitable for modern generations of clouds, virtual servers, traditional servers, colocation hosting, in-house servers, and the applications and databases that clients will want to run in all of these.&lt;br /&gt;&lt;br /&gt;So my message to cloud providers is to check out ZeroNines and Always Available as a means of protecting your service from downtime and the litigation that can come with it. &lt;br /&gt;&lt;br /&gt;My message to cloud customers is that you can apply ZeroNines and Always Available whether your cloud provider is involved or not. After all, your key interest here is to maintain business continuity, not to win a big settlement over an outage.&lt;br /&gt;&lt;br /&gt;And heads-up to the lawyers on both sides: We are setting a new standard in what constitutes "adequate steps".&lt;br /&gt;&lt;br /&gt;Visit the ZeroNines website to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;Alan Gin&lt;/b&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-3832494120515476771?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/3832494120515476771/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2012/01/legal-ramifications-of-cloud-outages.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/3832494120515476771'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/3832494120515476771'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2012/01/legal-ramifications-of-cloud-outages.html' 
title='The Legal Ramifications of Cloud Outages'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-5361259006517320441</id><published>2012-01-23T15:08:00.000-08:00</published><updated>2012-01-23T15:11:47.350-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='RIM'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='Balsillie'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='Heins'/><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='failover'/><category scheme='http://www.blogger.com/atom/ns#' term='blackberry'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='Lazaridis'/><title type='text'>RIM co-CEOs Resign: Is This the Cost of Downtime?</title><content type='html'>Back in October I commented in this blog about the enormous RIM BlackBerry outage [&lt;a href="http://zeronines.blogspot.com/2011/10/what-did-one-blackberry-user-say-to.html"&gt;source&lt;/a&gt;]. 
I wrote that "even a massive outage like this is unlikely to cause the demise of a large and important firm, but combined with other woes like a less-than-competitive product and poor business model it could well be the deciding factor."&lt;br /&gt;&lt;br /&gt;And now for the fallout. RIM is still in business, but its beleaguered co-CEOs/co-Chairmen Jim Balsillie and Mike Lazaridis have resigned and taken other positions within the company [&lt;a href="http://www.foxbusiness.com/technology/2012/01/23/rim-shuffles-top-management-amid-investor-pressure/"&gt;source&lt;/a&gt;]. I'm sure it was not the outage alone (or all RIM outages put together) that caused this leadership shakeup. But it could well have been the deciding factor.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Outages and CEO Job Security&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;RIM's product problems are certainly serious. But I see a fundamental difference between 1) the prescience needed to get the right product to market at the right time, and 2) the technical ability to keep an existing product up and running. Customers might to some degree forgive a company whose product is reliable but behind the times. They will abandon it if it doesn't work when they need it even if it is the newest, slickest thing around.&lt;br /&gt;&lt;br /&gt;The October outage has RIM "facing a possible class action lawsuit in Canada" [&lt;a href="http://news.cnet.com/8301-1035_3-57363615-94/rims-co-ceos-step-down-insider-heins-takes-helm/"&gt;source&lt;/a&gt;]. Add the cost of that to the costs of recovery, customer abandonment, lost shareholder value, and so forth. 
(Stay tuned; I will be commenting on the legal issues around cloud outages in the next few days.)&lt;br /&gt;&lt;br /&gt;To put RIM's decline in perspective, the company was worth $70 billion a few years ago but today has a market value of about $8.9 billion [&lt;a href="http://www.foxnews.com/scitech/2012/01/22/research-in-motion-ceos-to-step-down-as-part-overhaul/"&gt;source&lt;/a&gt;]. Their stock dropped about 75% last year and was down to $16.28 before the market opened on Monday January 23, 2012 [&lt;a href="http://www.foxbusiness.com/technology/2012/01/23/rim-shuffles-top-management-amid-investor-pressure/"&gt;source&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;So according to the rules of modern business, someone has to pay and in this case it is the CEOs.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Now Imagine This at a Smaller Company&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Can you imagine a three-day outage at a smaller software company? Or even a one-day outage? Imagine a typical e-commerce technology provider with 50 retail customers, 100 employees, and an SaaS application. If the core application, image server, database server, customer care system, inventory system, orders &amp;amp; fulfillment system, or other key element goes down that could be the end of them. Many smaller companies do not survive a significant downtime event. And many smaller retailers do not survive if they are unable to do business on a key shopping day such as Black Friday or Cyber Monday.&lt;br /&gt;&lt;br /&gt;Or even if the email system goes down for a couple hours. It happens all the time. Email is a key element of workflow and productivity and what company can afford to sit still for even a couple hours?&lt;br /&gt;&lt;br /&gt;It's more than the CEO whose job is at risk. 
Here's where an ounce of prevention is worth far more than a pound of cure.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;That Rickety Old Failover&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Remember my earlier comment about outdated yet reliable products, versus outdated and unreliable products? Ironically, the failover disaster recovery model that failed RIM back in October is one of those old and unreliable products. It was designed for systems and architectures that no longer bear any resemblance to what businesses are actually using. If failover worked I would not be writing this because there would be no need for its replacement.&lt;br /&gt;&lt;br /&gt;But if you want to find out about real business continuity and getting away from failover, take a look at ZeroNines. Our Always Available™ architecture processes in multiple cloud locations, on multiple servers, and in multiple nodes. There is no hierarchy so if one goes down the others continue processing all network transactions. ZeroNines can bring application uptime to virtually 100%. 
It is a complete departure from the failover that RIM is using, and that small businesses everywhere stake their futures upon.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Time Will Tell&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;"RIM earned its reputation by focusing relentlessly on the customer and delivering unique mobile communications solutions… We intend to build on this heritage to expand BlackBerry's leadership position," RIM's new CEO Thorsten Heins is quoted as saying [&lt;a href="http://news.cnet.com/8301-1035_3-57363615-94/rims-co-ceos-step-down-insider-heins-takes-helm/"&gt;source&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;Let's hope this "focus on the customer" also includes a strategic initiative to build genuine uptime and availability, or maybe we'll be reading about another new RIM CEO next January.&lt;br /&gt;&lt;br /&gt;Visit the ZeroNines website to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;Alan Gin&lt;/b&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-5361259006517320441?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/5361259006517320441/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2012/01/rim-co-ceos-resign-is-this-cost-of.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/5361259006517320441'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/5361259006517320441'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2012/01/rim-co-ceos-resign-is-this-cost-of.html' title='RIM 
co-CEOs Resign: Is This the Cost of Downtime?'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-7205095458059964607</id><published>2011-12-12T18:06:00.000-08:00</published><updated>2011-12-12T18:06:44.168-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Office 365'/><category scheme='http://www.blogger.com/atom/ns#' term='failover'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud'/><category scheme='http://www.blogger.com/atom/ns#' term='BAE'/><category scheme='http://www.blogger.com/atom/ns#' term='Ireland'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft'/><category scheme='http://www.blogger.com/atom/ns#' term='business continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><title type='text'>BAE, Microsoft, the Cloud, and Planning for When it All Goes Horribly Wrong</title><content type='html'>&lt;i&gt;"If it fails in Ireland, it goes to Holland. 
But what if it fails in Holland as well?"&lt;/i&gt;&lt;br /&gt;Paraphrase of Charles Newhouse, BAE [&lt;a href="http://www.crn.com/news/cloud/232300148/report-patriot-act-fears-squash-uk-defense-companys-microsoft-cloud-plan.htm;jsessionid=2bby9Jab6odd45qGZeYT0g**.ecappj03]"&gt;source&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Cloud news circuits have been abuzz the last few days over BAE rejecting Microsoft's Office 365 cloud solution because of the Patriot Act. This is the highest-profile rejection of a cloud offering I have seen. I am shocked and dismayed that after all the advancements that have improved continuity in the cloud, the network architectures our cloud service providers are offering are still in the stone age. They're still trying to use failover and pass it off as advanced and reliable. I can only assume that if given a 787 they would try to fly it off a dirt landing strip.&lt;br /&gt;&lt;br /&gt;When you read the articles closely, it is clear that the big issue for BAE was data sovereignty. How does one retain control of data during a network disaster, and where does it go when your service provider has to failover from the primary network node to the backup? To quote Charles Newhouse, head of strategy and design at British defense contractor BAE,&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;"We had these wonderful conversations with Microsoft where we were going to adopt Office 365 for some of our unrestricted stuff, and it was all going to be brilliant. I went back and spoke to the lawyers and said, '[The data center is in] Ireland and then if it fails in Ireland go to Holland.' 
And the lawyers said 'What happen[s] if they lose Holland as well?'" [&lt;a href="http://www.crn.com/news/cloud/232300148/report-patriot-act-fears-squash-uk-defense-companys-microsoft-cloud-plan.htm;jsessionid=WEFuM6L6EiFyP0puqLf3AQ**.ecappj02"&gt;source&lt;/a&gt;]&lt;/blockquote&gt;&lt;br /&gt;And earlier in the same article he described the user experience during a cloud outage:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;"A number of high profile outages that users have suffered recently demonstrated just how little control you actually have. When it all goes horribly wrong, you just sit there and hope it is going to get better. There's nothing tangibly you can do to assist" [&lt;a href="http://www.crn.com/news/cloud/232300148/report-patriot-act-fears-squash-uk-defense-companys-microsoft-cloud-plan.htm;jsessionid=WEFuM6L6EiFyP0puqLf3AQ**.ecappj02"&gt;source&lt;/a&gt;].&lt;/blockquote&gt;&lt;br /&gt;&lt;b&gt;It's About More than Just the Patriot Act&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The big focus in these articles is the Patriot Act. BAE lawyers forbade the use of Office 365 and the Microsoft public cloud because as a U.S. company, Microsoft could be required to turn BAE data over to the U.S. government under terms of the Patriot Act [&lt;a href="http://www.crn.com/news/cloud/232300148/report-patriot-act-fears-squash-uk-defense-companys-microsoft-cloud-plan.htm;jsessionid=WEFuM6L6EiFyP0puqLf3AQ**.ecappj02"&gt;source&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;It is true that the Patriot Act can require cloud service providers like Microsoft (and Amazon, Google, and others) to give the U.S. government the data on their servers, even if those servers are housed outside the United States [&lt;a href="http://www.zdnet.com/blog/igeneration/microsoft-admits-patriot-act-can-access-eu-based-cloud-data/11225"&gt;source&lt;/a&gt;]. 
Newhouse also said that "the geo-location of that data and who has access to that data is the number one killer for adopting to the public cloud at the moment" [&lt;a href="http://www.crn.com/news/cloud/232300148/report-patriot-act-fears-squash-uk-defense-companys-microsoft-cloud-plan.htm;jsessionid=WEFuM6L6EiFyP0puqLf3AQ**.ecappj02"&gt;source&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;But European governments are already moving to eliminate this loophole. As explained in November on ZDNet.com, a new European directive "will not only modernize the data protection laws, but will also counteract the effects of the Patriot Act in Europe" [&lt;a href="http://www.zdnet.com/blog/london/updated-european-law-will-close-patriot-act-data-access-loophole/742?tag=content;siu-container"&gt;source&lt;/a&gt;]. Sounds to me like Microsoft's jurisdictional problems will be solved for them. And failing that there is probably some creative and legal business restructuring that would do the trick.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;It's Really about Failover and its Shortcomings&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;So if European law will provide data sovereignty from a legal standpoint, why reject the Microsoft cloud? It all comes back to "when things go horribly wrong."&lt;br /&gt;&lt;br /&gt;When Newhouse describes the Ireland-to-Holland scenario, he is clearly talking about Microsoft failing-over from their Ireland datacenter to their Holland datacenter. I find it hard to believe that Microsoft thinks the outdated and flawed failover model is suitable for a leading cloud offering. Office 365 and their customers deserve better.&lt;br /&gt;&lt;br /&gt;Apparently BAE agrees. It put its foot down and refused to play because the reality does not match the promise.&lt;br /&gt;&lt;br /&gt;Failovers often fail, causing the downtime they were supposed to prevent. 
If the secondary site fails to start up properly (which is very common) or suffers an outage of its own, the business is either a) still offline or b) failed over to yet another location. The customer quickly loses control, network transactions get lost, and their data goes… where? Another server in Europe? Part of an American cloud? How many locations is Microsoft prepared to fail over to, and where are they? And with the cloud these issues loom even larger because there is no particular machine that houses the data.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Solution: Cloud and Data Reliability without Failover&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;ZeroNines offers two potential scenarios that will solve this problem: &lt;br /&gt;&lt;br /&gt;1) Protect the cloud provider's systems from downtime, offering a far more reliable cloud.&lt;br /&gt;&lt;br /&gt;2) Protect the business' systems from a cloud provider's downtime.&lt;br /&gt;&lt;br /&gt;Our Always Available technology is designed to provide data and application uptime well in excess of five nines. ZenVault Medical has been running in the cloud on Always Available for about 14 months with true 100% uptime. Always Available runs multiple network and cloud nodes in distant geographical areas. All servers and nodes are hot, and all applications are active. If one fails, the others continue processing as before, with no interruption to the business or the user experience. There is no failover, and thus no chance for outages caused by a failed failover. &lt;br /&gt;&lt;br /&gt;So if Microsoft were to adopt our Always Available technology, a storm like the one that knocked out their data center in Ireland this past August would not affect service. The Ireland node might go down, but all network activities would proceed as usual on other cloud data centers in Holland, Italy, or wherever they have set them up. 
Users would never know it.&lt;br /&gt;&lt;br /&gt;If BAE adopted Always Available, they could bring their Microsoft cloud node into an Always Available array with other cloud nodes or data centers of their own choosing. A failure in one simply means that business proceeds on the others. &lt;br /&gt;&lt;br /&gt;The business or the service provider can determine which nodes are brought into the array. BAE could choose to use only European cloud nodes to maintain data sovereignty.&lt;br /&gt;&lt;br /&gt;ZeroNines' Always Available technology is built precisely for the moment "when it all goes horribly wrong." The difference is that with ZeroNines, it won't mean downtime.&lt;br /&gt;&lt;br /&gt;Visit the ZeroNines website to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;Alan Gin&lt;/b&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-7205095458059964607?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/7205095458059964607/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2011/12/bae-microsoft-cloud-and-planning-for.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/7205095458059964607'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/7205095458059964607'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2011/12/bae-microsoft-cloud-and-planning-for.html' title='BAE, Microsoft, the Cloud, and Planning for When it All Goes Horribly Wrong'/><author><name>ZeroNines® Technology, 
Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-7665251522116659057</id><published>2011-12-01T16:01:00.000-08:00</published><updated>2011-12-02T16:41:52.482-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Victoria&apos;s Secret'/><category scheme='http://www.blogger.com/atom/ns#' term='retail'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='Dell'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon'/><category scheme='http://www.blogger.com/atom/ns#' term='Apple'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='failover'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='Black Friday'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='Cyber Monday'/><title type='text'>Retail Business Continuity on Black Friday and Cyber Monday</title><content type='html'>&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;The economy has heaved a sigh of relief after good sales reports from the Thanksgiving weekend. 
Have you stopped to really think about the importance of reliable IT systems and business continuity during this and other key sales events?&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;A company really may live or die according to what it or its service provider does in preparation for Black Friday and Cyber Monday. The game is in the hands of the technicians more and more every year.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;While these two days can herald great things during a good year, they can also seem like harbingers of doom if things don't go so well. Their grim-sounding names are oddly appropriate, and everyone watches with trepidation.&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;• Black Friday.&lt;/b&gt; Very ominous, evoking images of stock market crashes and other disasters. A few decades ago it came to mean "the day after Thanksgiving in which retailers make enough sales to put themselves 'into the black ink'" [&lt;a href="http://www.investorwords.com/482/Black_Friday.html"&gt;source&lt;/a&gt;] which is actually a good thing.&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;• Cyber Monday.&lt;/b&gt; Sounds like something from The Terminator. 
Actually… "The term 'Cyber Monday' was coined in 2005 by Shop.org, a division of the National Retail Federation [&lt;a href="http://www.shop.org/cybermonday#cyber_made_up"&gt;source]&lt;/a&gt;." This is the Monday after Thanksgiving, when online sales show a significant spike. Cyber Monday has become a major shopping day and economic indicator in its own right.&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;Jittery analysts are poised every year with their thumb on the Recession Early Warning button, ready to sound the alarm if the score doesn't add up and the game goes badly. (I think they secretly enjoy this.)&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;It's All IT's Fault. But No Pressure, Guys! : )&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;Every year in advance of this season opener, IT Managers beg for money to upgrade servers, replace old circuit breakers and backup batteries, service the cooling systems, and do a thousand other things to help prop up their networks for the onslaught. They also stock up on the coffee, donuts, and Valium that will keep them going through long days and even longer nights of watching, waiting, rebooting, hot swapping, and occasionally panicking over system crashes and failovers. 
I do not envy them, as the fate of the economy apparently rests upon their shoulders.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;If the IT systems go down the business is out of the game and the term "Black Friday" takes on an entirely new meaning. Revenue on Thanksgiving weekend is largely driven by time-sensitive discounts, so shoppers will buy from competitors if a website or point-of-sale (POS) system is down. For those of you running these systems, my heart goes out to you. I have been in similar situations myself many times.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;Thanksgiving Weekend Outages Mostly Due to Heavy Traffic&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;There were a number of reports of ecommerce sites becoming unavailable on Thanksgiving, Black Friday, and Cyber Monday. Victoria's Secret went beyond secret and became downright invisible three separate times, for a total of about 80 minutes [&lt;a href="http://www.internetretailer.com/2011/11/26/outage-hits-victoriassecretcom-black-friday"&gt;source]&lt;/a&gt;. 
I have read about downtime and poor site performance at many other online retailers as well, including PC Mall and Crutchfield [&lt;a href="http://www.internetretailer.com/2011/11/28/thanksgiving-weekend-brings-multiple-site-headaches"&gt;source&lt;/a&gt;]. Universally, there is no mention of the cause of all this downtime, but the implication is that it was simple old-fashioned traffic overload.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;Fire Suppression System Suppresses Sales on eBay&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;One outage not caused by traffic was ProStores, an online store solution used by lots of smaller operations to run their eBay storefronts. According to a Thanksgiving Day post by ProStores on their discussion board, "the data center fire suppression system tripped the Emergency Power Off (EPO) system causing a loss of power to the data center's raised floor environment" [&lt;a href="https://discussionboard.prostores.com/showthread.php?p=68898"&gt;source]&lt;/a&gt;. As is usual in such circumstances, it took most of the day before things could be brought back to normal. I strongly suggest you &lt;a href="https://discussionboard.prostores.com/showthread.php?p=68898"&gt;read their post&lt;/a&gt;, as it is an excellent account of the gyrations an IT department has to go through in such situations. I applaud ProStores for being so forthright and providing this information. 
&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;Preventing this and Other Outages&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;Always Available technology from ZeroNines could have prevented the ProStores outage entirely. Yes, that faulty fire suppression system would still have freaked out at that particular data center. But Always Available would have been running one, two, or more instances of the same applications and transactions in the cloud or at other data centers. ProStores clients and their customers would never have known there was a power outage and no sales would have been lost.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;ProStores made no mention at all of failover, so I assume they do not have a failover-based recovery system in place. With ZeroNines, that's perfectly fine because we do not use failover either. We make failover unnecessary. We offer disaster avoidance, not disaster recovery. There is no way to prevent all system malfunctions because there are too many complex parts. Next month maybe a circuit breaker will fail. After that, maybe it's a failed hard disk and an application crash. 
The list goes on.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;Girding Your Loins for Next Year&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;Online retailers wanting to guard themselves against a Black Friday blackout (or on any other day) should consider the modular approach ZeroNines takes. You can apply Always Available to selected high-value systems such as:&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: small;"&gt;Webstore servers and databases&lt;/span&gt;&lt;/li&gt;&lt;span style="font-size: small;"&gt;&lt;li&gt;Product/inventory databases&lt;/li&gt;&lt;li&gt;Payment systems&lt;/li&gt;&lt;li&gt;Image rendering systems&lt;/li&gt;&lt;/span&gt;&lt;/ul&gt;&lt;span style="font-size: small;"&gt;These will keep you running if something blows up. Close behind are customer service systems and warehousing/fulfillment. These become more important the closer you get to Christmas, as last-minute shoppers tend to need more personal help and there is no leeway for late shipments.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;To prevent traffic-related outages, set up proper load balancing. If huge players like J.C. 
Penney, Apple, Macy's, Sears, Amazon, and Dell can come through Cyber Monday with flying colors [&lt;a href="http://www.internetretailer.com/2011/11/29/jc-penney-apple-and-macys-take-prize-site-performance"&gt;source]&lt;/a&gt;, you can too. But for the hardware failures, human mistakes, software crashes, and other things that can hit you any day of the year as well, look into ZeroNines. &lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;span style="font-size: small;"&gt;&lt;a href="http://www.zeronines.com/"&gt;Visit the ZeroNines website&lt;/a&gt; to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoPlainText" style="font-family: Arial,Helvetica,sans-serif;"&gt;&lt;i&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;Alan Gin&lt;/b&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-7665251522116659057?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/7665251522116659057/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2011/12/retail-business-continuity-on-black.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/7665251522116659057'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/7665251522116659057'/><link 
rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2011/12/retail-business-continuity-on-black.html' title='Retail Business Continuity on Black Friday and Cyber Monday'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-2044652649307813157</id><published>2011-10-24T17:22:00.000-07:00</published><updated>2011-10-24T17:22:42.905-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Emerson'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='UPS'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='failover'/><category scheme='http://www.blogger.com/atom/ns#' term='Technology Review'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud storage'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='business continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='FAA airline'/><category scheme='http://www.blogger.com/atom/ns#' term='Cloud Computing'/><title type='text'>Building Outage Resistance into Network Operations</title><content type='html'>&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;An article I read the other day in MIT's Technology 
Review [&lt;a href="http://www.technologyreview.com/business/38722/?mod=chfeatured"&gt;source&lt;/a&gt;] nicely sums up what I've been hearing about cloud operations from dozens of clients, partners, and other colleagues around the country. The cloud is great for development, prototyping, and special projects for enterprises, but don't rely on it for anything serious. As that article says, "For all the unprecedented scalability and convenience of cloud computing, there's one way it falls short: reliability."&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;But the truth is that the tried-and-true models of network operations aren't all that reliable themselves, and neither are the disaster recovery systems that are supposed to protect them. Granted, they are probably more reliable than the cloud at this point, but downtime is downtime whether it's in the cloud or in a colocation facility. The effect is the same. 
&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;What is really needed is outage resistance that is built into network operations, whatever the model.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;&lt;b&gt;Why downtime happens&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;I recently read an interesting whitepaper from Emerson Network Power [&lt;a href="http://whitepapers.datacenterknowledge.com/content14060"&gt;source&lt;/a&gt;] that describes the seven most common causes of downtime as revealed by a 2010 survey by the Ponemon Institute (&lt;a href="http://www.ponemon.org/index.php"&gt;http://www.ponemon.org/index.php&lt;/a&gt;). The causes are all pretty mundane: UPS problems such as battery failure or exceeded capacity, power distribution unit and circuit breaker failures, cooling problems, human error, and similar things. All of them apply to any data center, whether in-house or in the cloud. None of the exciting stuff like fires, terrorism, or hurricanes made it into the top seven, though of course they could lead to a failure of a battery, circuit breaker, or cooling unit.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The Emerson whitepaper describes best practices that can reduce the likelihood of downtime induced by each of the top seven causes. 
That is all well and good, but some are very costly, such as remodeling server rooms "to optimize air flow within the data center by adopting a cold-aisle containment strategy." Other recommendations include regular and frequent inspection and testing of backup batteries, installation of circuit breaker monitoring systems, and increased training for staff. &lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;These are good ideas but costly, if not in capital for server room reconfiguration then in staff hours and other recurring costs. The paper contends that problems caused by human error are "wholly preventable" but I believe this is a mistake. No matter how stringent the rules or how well-documented the procedures, someone will take short cuts, overlook a vital step in the midst of a crisis, or sneak their donut and coffee into the control room. Applications fail under stress, databases fail to restart properly, and any number of other things can and do go wrong. There is no way to write contingencies for each, particularly when the initial failure leads to an unpredictable cascade effect.&lt;/span&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;&lt;b&gt;And what of the cloud?&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;I believe the cloud brings tremendous value to developers, SMBs, and other institutions that need low cost and great flexibility. 
Where else can an online store launch with a configuration that is not only affordable but also ready for both super-slow sales and a drastic ramp-up if sales shoot into the stratosphere? But like most “better, cheaper, faster” initiatives, the cloud has genuine reliability problems. A company running their own data center could choose to incur the expense and work of instituting all of Emerson's best practices since they are in control of the environment. But all they have from their cloud provider (or colocation provider for that matter) is their Service Level Agreement (SLA). They can't go in themselves and swap out aged batteries or fire the guy who persists in smuggling cinnamon rolls into the NOC.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The Technology Review article tells us that some companies are looking for ways to make their cloud deployments far more disaster resistant to start with, rather than just relying on their cloud provider's promises [&lt;a href="http://www.technologyreview.com/business/38722/?mod=chfeatured"&gt;source&lt;/a&gt;]. Seattle-based software developer BigDoor experienced service interruptions as a result of the Amazon cloud's big outage in April 2011. 
Co-founder Jeff Malek said "For me, [service agreements] are created by bureaucrats and lawyers… What I care about is how dependable the cloud service is, and what a provider has done to prepare for outages" [&lt;a href="http://www.technologyreview.com/business/38722/?mod=chfeatured"&gt;source&lt;/a&gt;].&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The same article describes the Amazon SLA and its implications: &lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;Even though outages put businesses at immense risk, public cloud providers still don't offer ironclad guarantees. In its so-called "service-level agreement," Amazon says that if its services are unavailable for more than 0.05 percent of a year (around four hours) it will give the clients a credit "equal to 10% of their bill." Some in the industry believe public clouds like Amazon should aim for 99.999 percent availability, or downtime of only around five minutes a year.&lt;/span&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;&lt;b&gt;The outage resistant cloud&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;ZeroNines can give you that 99.999% (five nines) or better, whether you are running a cloud or just running in the cloud. 
Cloud service providers could install an Always Available™ configuration on their publicly offered services, providing a highly competitive edge when attracting new customers.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;Individual businesses could install an Always Available array on their own networks, synchronizing any combination of cloud deployments, colocation, and in-house network nodes. It also facilitates cloud migration, because you can deploy to the cloud while keeping your existing network up and running as it always has. There is no monumental cloud migration that could take the whole network down and leave the business stranded if there's a glitch in starting an application. Instead, Always Available runs all servers hot and all applications active, enabling entire nodes to fall in and out of the configuration as needed without affecting service. The remaining nodes can update a new or re-started node once it rejoins the system.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;ZeroNines client ZenVault Medical (&lt;a href="http://www.zenvault.com/medical"&gt;www.zenvault.com/medical&lt;/a&gt;) developed and launched their live site in the cloud using an Always Available configuration. Since the day of its launch in September 2010 it has run in the cloud with true 100% uptime, with no downtime at all. That includes maintenance and upgrades. When a problem or maintenance cycle requires a node to be taken offline, ZenVault staffers remove it from the configuration, modify it as necessary, and seamlessly add it back into the mix once it is ready. 
ZenVault users don't experience any interruptions.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;Visit the &lt;a href="http://zeronines.com/"&gt;ZeroNines.com&lt;/a&gt; website to find out more about how our disaster-proof architecture can protect businesses of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;Alan Gin&lt;/b&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/i&gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-2044652649307813157?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/2044652649307813157/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2011/10/building-outage-resistance-into-network.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/2044652649307813157'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/2044652649307813157'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2011/10/building-outage-resistance-into-network.html' title='Building Outage Resistance into Network Operations'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' 
src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-4739786264536477804</id><published>2011-10-13T12:52:00.000-07:00</published><updated>2011-10-13T13:32:37.836-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='RIM'/><category scheme='http://www.blogger.com/atom/ns#' term='failover'/><category scheme='http://www.blogger.com/atom/ns#' term='blackberry'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='business continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='Cloud Computing'/><title type='text'>What Did One BlackBerry User Say to the Other BlackBerry User?</title><content type='html'>&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;Nothing, according to twitter user @giselewaymes (&lt;a href="http://www.cnn.com/2011/10/12/tech/mobile/blackberry-outage/index.html"&gt;source&lt;/a&gt;)&lt;/span&gt;&lt;span style="font-size: 9pt;"&gt;&lt;/span&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;; font-size: 9pt;"&gt;.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;In what has to be every large enterprise IT manager's worst nightmare, a big high profile outage grew into a monster, expanded to global 
proportions, made headlines everywhere, and after three days seemed to have no end in sight. The cause was a failed failover that could have been avoided.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;b style="mso-bidi-font-weight: normal;"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;Background: RIM BlackBerry&lt;/span&gt;&lt;/b&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;BlackBerry is produced by Canadian firm Research In Motion (RIM). It is one of the leading smartphones among business users. Its real forte is encrypted mobile email and instant messaging. BlackBerry has about 70 million users worldwide (&lt;a href="http://www.cnn.com/2011/10/12/tech/mobile/blackberry-outage/index.html"&gt;source&lt;/a&gt;). Several high-profile outages and many smaller ones have tarnished its reputation, and this week's outage seems to be pushing the company to the breaking point if all the buzz on the Internet is to be believed.&lt;/span&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;b style="mso-bidi-font-weight: normal;"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The Problem: Failed failover&lt;/span&gt;&lt;/b&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;On Monday morning, October 10, 2011, millions of BlackBerry users in Europe, the Middle East, and Africa lost access to messenger, email, and Internet. 
The outage spread to every continent and may eventually have affected half of all BlackBerry users (&lt;a href="http://www.politico.com/news/stories/1011/65774.html"&gt;source&lt;/a&gt;).&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;RIM explained things to some degree on their website on Tuesday, October 11: "The messaging and browsing delays that some of you are still experiencing were caused by a core switch failure within RIM’s infrastructure. Although the system is designed to failover to a back-up switch, the failover did not function as previously tested" (&lt;a href="http://www.rim.com/newsroom/service-update.shtml"&gt;source&lt;/a&gt;).&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;In other words, their failover-based disaster recovery system failed. It can be inferred that this led to cascading failures that knocked out other systems in other regions, leading to this worldwide problem. As of Wednesday evening the 12th it was still not fully resolved, with an interesting update posted on their site outlining the status in various parts of the world (&lt;a href="http://www.rim.com/newsroom/service-update.shtml"&gt;source&lt;/a&gt;)&lt;/span&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;. 
By Thursday morning it looked like things were finally under control, with service almost back to normal in most areas.&lt;/span&gt;&lt;/div&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;b&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The Cost: Paid compensation and a blow to the business&lt;/span&gt;&lt;/b&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;I don't doubt that RIM will compensate users in one way or another, perhaps in the form of free service (which seems to be the industry's de facto compensation currency). RIM Co-CEO Jim Balsillie said that such a step would be considered but that their immediate focus was fixing the problem (&lt;a href="http://www.moneycontrol.com/news/wire-news/blackberry-co-ceos-seek-to-controldamage_598978.html"&gt;source&lt;/a&gt;)&lt;/span&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;More damaging is the additional blow to RIM's reputation. Many users are claiming on Facebook, Twitter, and other online forums that this is the last straw and that they will quit BlackBerry. For many this may be a hollow threat, but there is genuine peril here. 
"This outage… comes at a particularly bad time for RIM, since it faces increasing competition in the smartphone market… Apple's iPhone and phones on the Google Android operating system have been gaining ground, and the new iPhone 4S goes on sale Friday (October 14)" (&lt;a href="http://www.cnn.com/2011/10/12/tech/mobile/blackberry-outage/index.html"&gt;source&lt;/a&gt;).&lt;/span&gt; &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The cost can be high outside of RIM as well. "The outage caught much of D.C. off guard Wednesday and underscored the region’s reliance on the BlackBerry — which is still the only federally approved smartphone for employees in some government agencies" (&lt;a href="http://www.politico.com/news/stories/1011/65774.html#ixzz1agZi4YXJ"&gt;source&lt;/a&gt;)&lt;/span&gt;&lt;span style="font-family: &amp;quot;Calibri&amp;quot;,&amp;quot;sans-serif&amp;quot;; font-size: 11pt;"&gt;.&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;As for RIM itself, back in June there was a flurry of articles suggesting RIM was potentially facing bankruptcy (&lt;a href="http://www.marketwatch.com/story/is-rim-the-next-nortel-networks-2011-06-23"&gt;source&lt;/a&gt;)&lt;/span&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;. And this week there have been a number of stories about growing momentum for a RIM breakup or merger (&lt;a href="http://dealbook.nytimes.com/2011/10/11/investor-says-momentum-builds-for-blackberry-break-up/"&gt;source&lt;/a&gt;). 
Even a massive outage like this is unlikely to cause the demise of a large and important firm, but combined with other woes like a less-than-competitive product and poor business model it could well be the deciding factor.&lt;/span&gt;&lt;/div&gt;&lt;span style="font-family: &amp;quot;Calibri&amp;quot;,&amp;quot;sans-serif&amp;quot;; font-size: 11pt;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;b style="mso-bidi-font-weight: normal;"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The Solution: Eliminate failover systems&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;RIM is in trouble for a number of reasons but downtime like this does not need to be one of them. I contend that the core problem was not a failed switch but a failed failover. Switches will fail and there is no avoiding that. If you can architect the perfect switch, I invite you to do so and you'll be richer than Bill Gates. &amp;nbsp;It's what happens &lt;u&gt;after&lt;/u&gt; the inevitable switch malfunction (or other disaster) that matters most. Failover systems will fail too. RIM's apparently worked fine during a test but the strain and chaos of a real-world crisis was too much for it. At ZeroNines, we propose eliminating the failover systems in favor of something that will turn failures into virtual non-events. &lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;ZeroNines' Always Available™ technology eliminates the need for failover, processing the same applications and data simultaneously on multiple servers, clouds, and virtual servers separated by thousands of miles. 
All servers are hot, and all applications are active. So if a switch fails in one network instance there is no need for a risky failover to another. Other instances are already processing the same transactions in parallel and simply continue processing as if nothing had happened. Once the problem with the switch is rectified, that instance is brought back into the Always Available array, is automatically updated, and resumes processing along with the others.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;b style="mso-bidi-font-weight: normal;"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The Numbers&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;RIM says that its service "has been operational for 99.7% of the time over the last 18 months" (&lt;a href="http://www.techradar.com/news/phone-and-communications/mobile-phones/rim-weighing-up-customer-compensation-for-blackberry-outage-1033814"&gt;source&lt;/a&gt;)&lt;/span&gt;&lt;a href="http://www.techradar.com/news/phone-and-communications/mobile-phones/rim-weighing-up-customer-compensation-for-blackberry-outage-1033814"&gt;&lt;/a&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;. That equates to about 1,576.8 minutes of downtime, or 26.28 hours per year.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;A good industry standard for uptime is 99.9% or three nines. That is 525.6 minutes of downtime, or 8.76 hours per year. 
&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;ZeroNines can provide in excess of five nines of uptime, or 99.999%. That is less than 5.3 &lt;u&gt;minutes&lt;/u&gt; of downtime per year. &lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;I do not know if planned downtime was included in RIM's 99.7% calculation. Companies often do not include planned downtime in their business continuity projections, counting only unplanned outages. But downtime is downtime from a user's perspective, whether caused by an accident or a planned maintenance cycle. ZeroNines protects against both.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;In the last 12 months since ZenVault Medical went live on an Always Available cloud-based architecture it has experienced true 100% uptime, with no downtime whatsoever for any reason. 
That includes planned maintenance, upgrades, and other events that would have taken an ordinary network offline.&amp;nbsp; &lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;br /&gt;Visit the &lt;a href="http://www.zeronines.com/"&gt;ZeroNines.com&lt;/a&gt; website to find out more about how our disaster-proof architecture can protect businesses&amp;nbsp;of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;Alan Gin&lt;/b&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-4739786264536477804?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/4739786264536477804/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2011/10/what-did-one-blackberry-user-say-to.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/4739786264536477804'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/4739786264536477804'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2011/10/what-did-one-blackberry-user-say-to.html' title='What Did One BlackBerry User Say to the Other BlackBerry User?'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' 
src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-8645133377190429107</id><published>2011-08-10T16:28:00.000-07:00</published><updated>2011-08-25T16:39:44.404-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Foursquare'/><category scheme='http://www.blogger.com/atom/ns#' term='Reddit'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon'/><category scheme='http://www.blogger.com/atom/ns#' term='Netflix'/><category scheme='http://www.blogger.com/atom/ns#' term='EC2'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='Instagram'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud storage'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='business continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='Cloud Computing'/><title type='text'>Amazon EC2 Outage: Déjà Vu All Over Again</title><content type='html'>&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;It seems we can always rely on cloud outages to spice up the news feeds. Today, it's another Amazon EC2 Cloud outage, which is a nice departure from the wildly gyrating stock market and the U.S. debt downgrade. 
&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;I didn't write about Amazon's big April 2011 EC2 outage simply because I was overwhelmed with other work (along with texts, tweets and emails about the outage). That outage affected big-name customers like Netflix, Foursquare, HootSuite, and Reddit (&lt;a href="http://www.crn.com/news/cloud/229402004/amazon-ec2-goes-dark-in-morning-cloud-outage.htm"&gt;source&lt;/a&gt;&lt;/span&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;). Some EC2 customers' websites were down for as much as two days.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;Then just this past weekend an electrical storm over Dublin Ireland led to a lightning strike on a transformer and a subsequent explosion, fire, and loss of power at an Amazon data center. Backup generators could not be started. Amazon's European EC2 Service was affected for as long as twelve hours. Some Microsoft cloud services were knocked out as well (&lt;a href="http://www.pcmag.com/article2/0,2817,2390664,00.asp"&gt;source&lt;/a&gt;&lt;/span&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;).&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;I am a huge proponent of the cloud; however, I believe reliability can and should improve. As a frequent speaker and panelist at cloud-related events, I find that many in the audience are not convinced that the cloud is reliable enough to meet the needs of mission-critical applications. Outages like this don’t help. 
However, I am aware of several successful implementations of robust, outage-resistant cloud deployments that simply have not gotten any attention because the clients are not motivated to share how they did it with their competitors. Some of these early adopters took risks and made large investments when the mainstream would not, and they feel they deserve some advantage while they can get it. Naturally enough I think ZeroNines has the right solution, but read on for now.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;b style="mso-bidi-font-weight: normal;"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;Background: Amazon as a major cloud provider&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;Amazon EC2 is the Amazon Elastic Compute Cloud (&lt;a href="http://aws.amazon.com/ec2/"&gt;source&lt;/a&gt;&lt;/span&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;). It provides thousands of online service providers and software developers easy access to cloud computing capacity that is variable in size. Customers pay only for what they use. Their customers include Netflix (streaming movies and TV shows), Instagram (photo sharing), Reddit (social networking for sharing news), and Foursquare (location-based social networking). 
&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;b style="mso-bidi-font-weight: normal;"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The Problem: Something's rotten in the state of Virginia&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;I have not found a clear statement yet that describes the exact cause of the August 8 outage, but PCMag.com says that it "closely mirrors a similar cloud outage Amazon suffered in April" (&lt;a href="http://www.crn.com/news/cloud/231300459/new-amazon-cloud-outage-takes-down-netflix-foursquare.htm?itc=refresh"&gt;source&lt;/a&gt;&lt;/span&gt;&lt;a href="http://www.crn.com/news/cloud/231300459/new-amazon-cloud-outage-takes-down-netflix-foursquare.htm?itc=refresh"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;). It also happened in the same Virginia data center. The April 2011 outage "happened after Amazon network traffic was 'executed incorrectly.' Instead of shifting to another router, traffic went to a lower-capacity network, taking down servers in Northern Virginia." (&lt;a href="http://www.pcmag.com/article2/0,2817,2390702,00.asp"&gt;source&lt;/a&gt;&lt;/span&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;). 
So Amazon loses points for allowing the same problem to happen twice in the same place, but wins a few back for apparently being ready this time and containing the August 8 outage to minutes rather than days.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;b style="mso-bidi-font-weight: normal;"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The Cost: Revenue and reputation&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;As always with these outages there is talk of the provider compensating its customers through waived fees and such. Mark that against Amazon's balance sheet. Customers no doubt lost business, and you can mark that against &lt;i style="mso-bidi-font-style: normal;"&gt;their&lt;/i&gt; balance sheets. Reliability issues will chase away customers who don't want to risk their own revenue with a service notorious for crashing. But if the cloud nonetheless offers the best business model, what do these customers do? Press for lower fees and more favorable service level agreements for one.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;b style="mso-bidi-font-weight: normal;"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The Solution: Prevention, not recovery&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;If you're an actual or potential cloud user (with any provider), Always Available™ from ZeroNines can protect your existing systems without changing providers, hardware, operating systems, or applications. 
If there's a disaster in any part of your system, all your networked transactions and applications continue functioning as normal on the other network nodes. Our CloudNines™ application can protect your cloud-based infrastructure, VirtualNines™ can protect virtualized environments on your own machines, and EnterpriseNines™ can add Always Available protection to any other network infrastructure. You can mix and match so all these can interoperate seamlessly. For businesses of any size, the result is uptime of virtually 100% regardless of the disasters that may strike any individual node in the Always Available array.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt;The cloud providers themselves could use the same CloudNines product to protect their systems, virtually eliminating downtime and avoiding headlines like Amazon's. We are currently developing and monitoring on Amazon and other cloud platforms. 
Our technology is certified for Windows Server&lt;sup&gt;®&lt;/sup&gt; 2008, compatible with Windows Server&lt;sup&gt;®&lt;/sup&gt; 2008 Hyper-V™ and Hyper-V™ Server, and certified as VMWare&lt;sup&gt;®&lt;/sup&gt; ready.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Visit the &lt;a href="http://www.zeronines.com/"&gt;ZeroNines.com&lt;/a&gt; website to find out more about how our disaster-proof architecture can protect businesses&amp;nbsp;of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;Alan Gin&lt;/b&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/i&gt;&lt;span style="font-family: &amp;quot;Arial&amp;quot;,&amp;quot;sans-serif&amp;quot;;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-8645133377190429107?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/8645133377190429107/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2011/08/amazon-ec2-outage-deja-vu-all-over.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/8645133377190429107'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/8645133377190429107'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2011/08/amazon-ec2-outage-deja-vu-all-over.html' title='Amazon EC2 Outage: Déjà Vu All Over Again'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' 
src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-7441823697430483814</id><published>2010-10-11T10:06:00.000-07:00</published><updated>2010-10-14T14:31:55.486-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='health'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='security'/><category scheme='http://www.blogger.com/atom/ns#' term='PHR'/><category scheme='http://www.blogger.com/atom/ns#' term='EMR'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft'/><category scheme='http://www.blogger.com/atom/ns#' term='privacy'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='Google'/><category scheme='http://www.blogger.com/atom/ns#' term='Cloud Computing'/><category scheme='http://www.blogger.com/atom/ns#' term='medical'/><title type='text'>Announcing ZenVault Medical: Your Cloud-Based, Secure, Encrypted Personal Health Record</title><content type='html'>I had a heart attack back in 2008. I was lucky. My local emergency room facility and the intensive care unit hospital that I was transferred to happened to share my medical records in electronic format. But only about 10% of U.S. hospitals use electronic records so if this had happened away from home I probably would have died because no other doctor or hospital would have known about my pre-existing medical conditions.&lt;br /&gt;&lt;br /&gt;It was suddenly very easy for me to see the need for a system that would allow consumers to take their medical records with them wherever they go. Not only for emergencies but for everyday reference. 
Some quick Googling revealed Personal Health Record (PHR) solutions from Microsoft (HealthVault), Google (Google Health) and a large number of others, but consumer adoption was low. I also discovered that the Electronic Medical Records (EMRs) used by hospitals and doctors were no solution because they are inaccessible to consumers and practitioners outside the system.&lt;br /&gt;&lt;br /&gt;I enlisted the help of my personal doctors, friends and classmates who work in the healthcare field as well as other technologists who are consulting to large medical organizations around the country. All told, we have consulted with 36 experts who freely gave us their opinions about the issues surrounding EMRs and how a comprehensive PHR should be designed in order to deliver high value to consumers while potentially saving lives. I summarize the issues in BOLD and describe how we address them.&lt;br /&gt;&lt;br /&gt;So today we at ZeroNines introduced ZenVault Medical (&lt;a href="http://www.zenvault.com/medical"&gt;www.zenvault.com/medical&lt;/a&gt;), a Cloud-based, private, encrypted, online PHR for consumers that you can access through a computer or mobile device. In addition to helping people with their medical care, it’s a great example of how the Cloud and other cutting-edge technologies can come together to create a unique and valuable consumer product.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Background: The Need for Digital Medical Records&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;If you’re like most people, your medical records are scattered among a number of doctors and they are hard to get to. The Obama administration wants the country to &lt;a href="http://health.usnews.com/health-news/blogs/heart-to-heart/2009/2/17/electronic-medical-records-will-your-privacy-be-safe.html"&gt;convert to Electronic Medical Records&lt;/a&gt;. 
The goal is to improve healthcare and cut costs by making an individual’s collection of medical records available electronically at any hospital or doctor’s office, cutting down on paper volume, saving time, and increasing accessibility particularly in emergencies. This truly needs to happen – my own experience proves that – but the issue is how. &lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Problem: Security, Privacy, and Reliability&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;Questions surrounding security and privacy make many &lt;a href="http://health.usnews.com/health-news/blogs/heart-to-heart/2009/2/17/electronic-medical-records-will-your-privacy-be-safe.html"&gt;citizens and consumer advocates reluctant to jump on board&lt;/a&gt;. Will such a system be run by the government or by business? Who will have access? Will sensitive personal information about illnesses, prescriptions, and treatments be turned over to insurance companies? To marketers? To employers? Can any body of law successfully regulate how such highly personal information is handled and protected, enabling it to benefit the individual yet keeping it out of the hands of those who would profit by violating privacy? Is it even the government’s place to get involved with personal medical records? And what technology is secure enough to handle all this?&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Security:&lt;/strong&gt; Any medical records system needs to keep hackers at bay. Well-publicized data breaches with Microsoft and Google call into question their ability to protect medical privacy. 
Frankly, I decided to subscribe to one of these systems before we came up with ZenVault, but was concerned about who might be accessing my records and selling them to insurance companies and marketing firms.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Privacy:&lt;/strong&gt; Many companies offering free digital medical records turn around and &lt;a href="http://www.ktvu.com/news/24278317/detail.html"&gt;sell customer data to pharmaceutical and insurance companies&lt;/a&gt;. And a September 16, 2010 &lt;a href="http://online.wsj.com/article/SB10001424052748704285104575492440245394392.html?mod=WSJ_Tech_LEFTTopNews"&gt;article in the Wall Street Journal&lt;/a&gt; described a data breach wherein a Google engineer broke the company’s privacy policies by accessing private customer information.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Reliability:&lt;/strong&gt; If anything needs 100% uptime, it’s medical applications. Take a look at some of the high-profile downtime events discussed in the rest of this blog and then imagine the cost in lives and well-being if they had affected hospital emergency rooms. &lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Solution: Customer Control of a Safe, Secure, and Always Available™ Personal Health Record&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Simply putting control of the health record in the hands of the individual consumer or patient addresses the bulk of these concerns. If no one can read the record but the customer, that’s most of the battle won. So what is the difference between ZenVault Medical and other consumer-facing PHRs like Google Health and HealthVault?&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Security:&lt;/strong&gt; ZenVault encrypts stored records with a patent-pending variant of the NSA-approved encryption protocols that protect top-secret information. ZenVault does not employ a “key ring” that stores customer encryption keys, which means there is no copy available for anyone to find and use to rummage through your data. 
The customer creates his or her own unique encryption key so only they can access and edit their private medical records. SSL-secured sessions protect data in transit from computers, smartphones, and tablets. &lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Privacy:&lt;/strong&gt; ZenVault never shares information. Period. We don’t sell it, rent it, or give it away, not even in a “sanitized” format like some admit to doing. We charge consumers for our service and our business model is based on customer trust. If they don’t trust us we lose. In fact, our encryption system prevents even our own engineers and administrators from reading patient data, so we couldn’t sell it even if we wanted to. How’s that for a guarantee?&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Reliability:&lt;/strong&gt; ZenVault uses ZeroNines' Always Available™ technology designed to protect the world's most sensitive financial and military computer systems. There is virtually no "downtime" or data loss with ZenVault. A Cloud-based infrastructure helps keep costs down, ensures scalability, and supports universal accessibility. Use of Always Available allays any concerns over Cloud reliability. In fact, we intend to use ZenVault as an example of a highly reliable, high-usage application deployed in the Cloud. Read more about Always Available on the &lt;a href="http://www.zeronines.com/site/solution/products.shtml"&gt;ZeroNines.com website&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Convenience:&lt;/strong&gt; Users can update or read their records anywhere they have Internet access. They can send their records to any doctor with just a few clicks using a secure message system. Have you ever wasted time at a doctor appointment filling out a clipboard full of medical history forms? Use ZenVault to send them your PHR instead! 
Doctors can send patients their records, lab results, and x-rays with equal ease.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Affordable: &lt;/strong&gt;A free account is available, offering a basic PHR with full security, encryption, and privacy protection. A premium account adds advanced features for a small monthly charge.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Secure Emergency Room Access:&lt;/strong&gt; ZenVault offers emergency rooms their own accounts with their own special encryption keys. They get controlled access to six key fields in a patient’s record such as history of heart disease, drug sensitivities, and emergency contact information. This gives them the basic information they need to save a life and contact loved ones yet protects the majority of personal information until the patient or their family elects to release it.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Take Your Personal Health Record with You&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;If you have Internet access, you can use ZenVault. I hope none of you ever has a medical emergency like the one that sent me to the hospital two years ago. But if you do, ZenVault could save your life by putting the needed information in the right place, at the right time. I have no doubt that one day a universal health record database will be a reality, but until then you can have all the benefits while keeping control yourself. Try it out and let me know what you think:&amp;nbsp;&amp;nbsp;&lt;a href="http://www.zenvault.com/medical"&gt;www.zenvault.com/medical&lt;/a&gt;. 
&lt;br /&gt;&lt;br /&gt;Visit the &lt;a href="http://www.zeronines.com/"&gt;ZeroNines.com&lt;/a&gt; website to find out more about how our disaster-proof architecture can protect businesses&amp;nbsp;of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-7441823697430483814?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/7441823697430483814/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2010/10/announcing-zenvault-medical-your-cloud.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/7441823697430483814'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/7441823697430483814'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2010/10/announcing-zenvault-medical-your-cloud.html' title='Announcing ZenVault Medical: Your Cloud-Based, Secure, Encrypted Personal Health Record'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-2986074190256478368</id><published>2010-05-24T10:39:00.000-07:00</published><updated>2010-05-27T10:52:44.008-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='financial'/><category scheme='http://www.blogger.com/atom/ns#' term='TD AMERITRADE'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='business continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><title type='text'>TD AMERITRADE Outage and How Failover Fails Finance</title><content type='html'>Online brokerage TD AMERITRADE was offline for 80 minutes on Thursday May 20, 2010 [&lt;a href="http://www.marketwatch.com/story/td-ameritrade-fixes-web-site-access-problem-2010-05-20"&gt;source&lt;/a&gt;]. Because of the outage, some of their clients could not log in to their accounts to place trades during the powerful market downdraft that occurred that day [&lt;a href="http://news.yahoo.com/s/ap/20100520/ap_on_bi_ge/us_td_ameritrade_outage"&gt;source&lt;/a&gt;]. Outages among financial firms have gotten a lot of coverage in the last couple years, no doubt because of the universally amped-up sensitivity to any kind of news with the word “financial” attached to it. Here’s a brief look at this outage, and some commentary on outages in general among financial companies.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Background: About TD AMERITRADE&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Online discount broker TD AMERITRADE has millions of U.S. customers (Wikipedia reports over six million), and many more internationally. The company has grown rapidly through acquisition and was the 746th-largest US firm in 2008 [&lt;a href="http://en.wikipedia.org/wiki/TD_Ameritrade"&gt;source&lt;/a&gt;]. It acquired thinkorswim Group, Inc., another popular online brokerage, in January 2009. 
Lots of average Americans use TD AMERITRADE to generate income and manage retirement accounts. I use them myself and really like their system, but did not notice the outage because I was doing other things at the time.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Problem: An Outage of Some Kind&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;At about 11:40 AM Eastern time, clients found that they could not log on to the TD AMERITRADE retail website. The outage ended at about 1:00 PM. No disruption was reported on their mobile site or at their subsidiary thinkorswim [&lt;a href="http://www.marketwatch.com/story/td-ameritrade-fixes-web-site-access-problem-2010-05-20"&gt;source&lt;/a&gt;]. Clients already logged in experienced no trouble, prompting one writer to speculate that it was a web authorization issue of some kind [&lt;a href="http://www.brokernewsblog.com/index.php/20100521355/Broker-News/News/Just-How-Extensive-Was-TD-Ameritrade-s-Outage-AMTD-SCHW-ETFC.html"&gt;source&lt;/a&gt;]. If TD AMERITRADE has made a formal announcement of the cause, a half hour of Googling on my part failed to find it.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Was This a Failed Failover?&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Posted on the TD AMERITRADE site [&lt;a href="http://www.tdameritrade.com/aboutus.html"&gt;source&lt;/a&gt;] is the “TD AMERITRADE Business Continuity Plan Statement” [&lt;a href="http://www.tdameritrade.com/forms/AMTD5491.pdf"&gt;source&lt;/a&gt;]. One of the statements in this brief public document reads “Disruption of service at any of our service centers will result in calls, orders and electronic communications being re-routed to an alternative service center located in a different region of the country with a separate power grid and transportation system.” &lt;br /&gt;&lt;br /&gt;Let me state clearly that I am entering the realm of speculation here. The statement quoted above implies that TD AMERITRADE is relying on a business continuity plan based on failover architecture. 
Failover or cutover has been the de facto choice for business continuity and until recently it has been the only real game in town. But it is by nature unreliable and even the best systems are subject to downtime. If their backup plan is indeed based on failover, then failover obviously failed them.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Cost: As Always, it’s the Intangibles&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;As in so many outages of this kind, the real costs are difficult to estimate. Easiest to ponder are the lost commissions from trades that could not occur during an extremely busy trading day. Less tangible are the effects on reputation and customer satisfaction. No one wants a broker that is unavailable when they need it most. One customer claimed to have lost about $2,000 from being unable to log in [&lt;a href="http://news.yahoo.com/s/ap/20100520/ap_on_bi_ge/us_td_ameritrade_outage"&gt;source&lt;/a&gt;]. TD AMERITRADE stock fell about 3.7% that day but this may not mean much because markets overall were down about 3%. &lt;br /&gt;&lt;br /&gt;According to a May 2007 article from Financial Services Technology, a study from the Meta Group revealed that “the cost per hour for downtime – ranging from simple network outages to major emergencies – in the financial services sector is, on average, $1.4 million” [&lt;a href="http://www.usfst.com/article/Business-continuity-planning-the-human-side/"&gt;source&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;An Ugly Thought: Downtime among High Frequency Traders&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;For many, the cost will be far higher. Some banks, hedge funds, and other high-power financial firms engaged in High Frequency Trading (HFT) make billions of trades a day over ultra-high speed connections [&lt;a href="http://www.nytimes.com/2009/07/24/business/24trading.html"&gt;source&lt;/a&gt;]. Many trades live for only a few seconds. 
Enormous transactions are conceived and executed in half a second, with computers evaluating the latest news and acting on it well before human traders even know what the news is. HFT is having a significant effect on markets; there is evidence that the history-making “Flash Crash” of May 6, 2010 was caused and then largely corrected by High Frequency Trading [&lt;a href="http://personalmoneystore.com/moneyblog/2010/05/17/tradeworx-hedge-fund/"&gt;source&lt;/a&gt;]. What would happen if one of these HFT systems were down for an hour and a half? Or even just a minute? Whatever your stance on the ethics of HFT, I think it fair to say that those engaged in it need to avoid downtime at all costs.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Failover Can’t Handle It&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Even a successful failover event may cause some glitches and lost trades among the average retail trading populace. But if a High Frequency Trading system experiences such a glitch, billions of dollars could be lost in the blink of an eye. The trades themselves may fail, and by the time the system comes back up the conditions that made those trades possible are a thing of the past. And that’s for a successful failover. A failed failover can leave businesses out of the race for minutes, hours, and even days.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Alternative: Active/Active Architecture&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;High profile financial systems clearly need something better than failover. The typical outage is caused by failures of server hardware, server software, upgrades, maintenance, and sometimes more dramatic events like fires and floods. The best protection in these cases is to eliminate failover entirely, and switch to an “active/active” or “hot/hot” architecture that eliminates the chance of a failed cutover and the resultant downtime. Always Available™ business continuity architecture from ZeroNines is one such system. 
Always Available processes all network transactions continually, simultaneously, and equally in multiple locations on multiple servers, all of which are hot and all of which are active. Always Available can offer virtually 100% uptime, because instead of relying on failover Always Available simply continues running the same apps and data at two or three additional locations, with no interruption to the user. So if a web server or database goes down somewhere, the other nodes of the system continue processing without missing a beat. Visit the &lt;a href="http://www.zeronines.com/"&gt;ZeroNines website&lt;/a&gt; to find out more.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-2986074190256478368?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/2986074190256478368/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2010/05/td-ameritrade-outage-and-how-failover.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/2986074190256478368'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/2986074190256478368'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2010/05/td-ameritrade-outage-and-how-failover.html' title='TD AMERITRADE Outage and How Failover Fails Finance'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' 
src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-7220276428901611511</id><published>2010-01-22T11:08:00.000-08:00</published><updated>2010-01-23T11:13:20.864-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Haiti'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='Twitter'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='business continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='Cloud Computing'/><title type='text'>Twitter Grows Up and then Falls Down</title><content type='html'>Do you Twitter? Or Tweet? Or whatever they call it? Gotta admit, I don’t. So I didn’t really pay much attention when I first heard that Twitter had gone down the other day [&lt;a href="http://www.informationweek.com/news/storage/disaster_recovery/showArticle.jhtml?articleID=222301682"&gt;source&lt;/a&gt;]. Life goes on. But what a good thing (I thought) that this had not happened a week ago when Twitter took the spotlight on the world stage as it helped gather money for earthquake relief in Haiti. &lt;br /&gt;&lt;br /&gt;That made positive headlines everywhere. But if the outage had occurred during the first critical hours or days of the relief effort, a self-righteous world would instead have sneered at Twitter for having failed, despite the fact that Twitter was never billed as a source of disaster relief. 
This is a window into an important reality: you’d better plan uptime into your system because you never know when you will be caught in the spotlight.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Background: Twitter Comes of Age &lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Twitter is an instant messaging system that allows short messages of up to 140 characters to be sent to a subscriber’s contacts, or be made available to the Twitter community at large. Millions of people use Twitter every day. Data reported in Wikipedia [&lt;a href="http://en.wikipedia.org/wiki/Twitter"&gt;source&lt;/a&gt;] shows that over 75% of the messages on Twitter are either “conversational” or “pointless babble.” A small but powerful percentage of messages are for far more serious purposes. Twitter was drafted into service for political campaigning, education, public relations, and emergencies long before the Haiti earthquake. But I see its Haiti relief efforts as the moment it came of age, when Twitter was first used to mobilize money on a mass worldwide scale for a focused, responsible, humanitarian purpose. &lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Problem: A Failover Failure&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;On the morning of Wednesday January 20, 2010, Twitter became virtually inaccessible. According to Information Week, "A sudden failure coupled with problems in switching to a backup system produced a high number of errors for around 90 minutes" [&lt;a href="http://www.informationweek.com/news/storage/disaster_recovery/showArticle.jhtml?articleID=222301682"&gt;source&lt;/a&gt;]. In other words, an unspecified failure in one place forced the system to rely on its “failover” architecture, which in turn failed. This is a classic failover failure.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Cost: Hard to Quantify but Scary to Contemplate&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Since Twitter service is free, there may be no direct cost to Twitter. 
Indirectly, this event contributes negatively to Twitter’s overall equation for obtaining venture capital, building a positive public image, and eventually making money from paid services.&lt;br /&gt;&lt;br /&gt;And here’s where we get into very uncertain territory. What would the cost have been if it had happened just a few days earlier? Would millions of dollars in aid to Haiti have been delayed or failed to materialize? Would people who were saved by this aid have died? Possibly. At the very least, Twitter would have experienced a PR storm far more serious than the January 20 outage caused.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Solution: Ditch the Failover&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Failover (also known as cutover) is the de facto recovery solution for dealing with IT disasters, but it contains inherent flaws that often prevent it from working at the very moment it is needed. Vast numbers of companies and other organizations in the U.S. and around the world rely on failover to keep them functioning in the event of their own disasters, be they failed server equipment or regional catastrophes. &lt;br /&gt;&lt;br /&gt;ZeroNines has designed an Always Available™ business continuity architecture that does away with failover entirely. No backup systems ever need to kick in with only microseconds of notice. Instead, processing of all network transactions occurs continually, simultaneously, and equally in multiple locations. In short, Always Available prevents disasters like Twitter’s 90-minute outage on Wednesday. Instead of relying on a cutover event to succeed, Always Available simply continues running the same apps and data at one of two or three additional locations, with no interruption to the user. &lt;br /&gt;&lt;br /&gt;&lt;strong&gt;A Parting Thought&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Systems like Twitter’s, which rely on failover in the event of disasters, currently form the backbone of business and government information systems. 
Suppose that earthquake had happened somewhere in the U.S. (which one day it will) and knocked out data centers, communications, and other key infrastructure? If the failover systems fail like they failed Twitter (which they will), then what is the prospect for marshalling aid within our own borders? A scary thing to consider.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.zeronines.com/"&gt;Visit the ZeroNines website&lt;/a&gt; to find out more about how our disaster-proof architecture can protect businesses (and government agencies) of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-7220276428901611511?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/7220276428901611511/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2010/01/twitter-grows-up-and-then-falls-down.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/7220276428901611511'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/7220276428901611511'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2010/01/twitter-grows-up-and-then-falls-down.html' title='Twitter Grows Up and then Falls Down'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' 
src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-390951587522325410</id><published>2009-12-22T10:52:00.000-08:00</published><updated>2009-12-22T10:56:02.611-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='Stratus Technologies'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud storage'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='business continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='ITIC'/><category scheme='http://www.blogger.com/atom/ns#' term='Cloud Computing'/><category scheme='http://www.blogger.com/atom/ns#' term='uptime survey'/><title type='text'>Most Businesses Don’t Know what Downtime Costs Them</title><content type='html'>I just discovered the results of a survey about the need for application availability among businesses [&lt;a href="http://itic-corp.com/blog/2009/04/application-availability-reliability-and-downtime-ignorance-is-not-bliss/"&gt;source&lt;/a&gt;]. The survey was conducted by ITIC and Stratus Technologies. Results were released in April 2009. 
It sought to find out how much application uptime businesses think they need, and what they intend to do about it.&lt;br /&gt;&lt;br /&gt;The survey found that overall, IT executives are aware that the need has grown for high-availability applications and the infrastructure to support them. But budgets are too low to support them, and most companies do not know what their downtime is costing them. This makes it difficult for these same executives to make a budgetary case for implementing high uptime solutions.&lt;br /&gt;&lt;br /&gt;Downtime is a business killer. As an example, consider that of the 350 companies in the World Trade Center before the 1993 truck bombing, 150 were out of business a year later because of the disruption. [source: Gartner/RagingWire report cited in “Without the wires,” Fabio Campagna, Disaster Recovery Journal, Winter 2002]. &lt;br /&gt;&lt;br /&gt;The big lesson here: there is a significant competitive advantage in investing in uptime.&lt;br /&gt;&lt;br /&gt;Here are some key facts from the survey, and my thoughts about them.&lt;br /&gt;&lt;br /&gt;1) “Two out of five businesses – 40% – report that their major business applications require higher availability rates than they did two or three years ago. However an overwhelming 81% are unable to quantify the cost of downtime and only a small 5% minority of businesses are willing to spend whatever it takes to guarantee the highest levels of application availability 99.99% and above.”&lt;br /&gt;&lt;br /&gt;Clearly, the field is wide open for companies to pull ahead if they go for four or five nines of uptime (or more), particularly those that serve vital and highly regulated sectors such as financial, healthcare, defense, data hosting, and so forth. 
A company that falls out of compliance with strict regulations like Sarbanes-Oxley or HIPAA can be driven to the brink by fines, the costs of regaining compliance, and lost business.&lt;br /&gt;&lt;br /&gt;2) “The survey results uncovered many “disconnects” between the levels of application reliability that corporate enterprises profess to need and the availability rates their systems and applications actually deliver.” In other words, businesses are not getting the uptime they require, whether it is to meet SLAs or simply conduct everyday business.&lt;br /&gt;&lt;br /&gt;In reality, the uptime that company leaders “profess to need” is probably insufficient. Considering that a downtime event of only a few seconds can cause a cascading failure in applications and databases, they probably need uptime of practically 100% in order to avoid a bigger disaster. Once that first domino falls, maybe you can grab it and stand it back up but all the rest are already falling. The damage is done.&lt;br /&gt;&lt;br /&gt;3) “Some 41% said they would be satisfied with conventional 99% to 99.9% (the equivalent of two or three nines) availability for their most critical applications.”&lt;br /&gt;&lt;br /&gt;I can’t imagine a company being without its “most critical application” for between 8+ hours a year (for 99.9% uptime) and more than three and a half days a year (for 99%). Companies have gone out of business after downtime of less than that. I can’t help but believe that the executives who answered this question this way are somehow out of touch with the realities of their environment. Maybe they are in industries where expectations are really low. But can you think of a bank or stock brokerage or hospital where one- or two-day outages a couple times a year are the norm? I can’t. 
And that is probably because such companies cease to exist.&lt;br /&gt;&lt;br /&gt;Contrast that with this: “An overwhelming 81% of survey respondents said the number of applications that demand high availability has increased in the past two-to-three years.” High availability is typically considered to be four nines (99.99% availability and above) or less than 53 minutes of downtime per year. Yet 41% of respondents say they would be satisfied with only two or three nines? Astounding.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Disaster of Disaster Recovery&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;IT executives typically prepare for downtime by implementing some variation of the backup/failover paradigm, even though most are aware it is unlikely to work. I invite you to read the ZeroNines whitepaper “The Disaster of Disaster Recovery” (available on the &lt;a href="http://www.zeronines.com/site/solution/whitepapers.shtml"&gt;ZeroNines.com website&lt;/a&gt;) which looks at the causes of downtime and explores the shortcomings of the predominant failover disaster recovery technique. It also discusses the ZeroNines alternative, which can bring uptime beyond any measure of “nines” to virtually 100%.&lt;br /&gt;&lt;br /&gt;ZeroNines Technology, Inc. is not affiliated with ITIC, the Information Technology Intelligence Corp. 
or with Stratus Technologies.&lt;br /&gt;&lt;br /&gt;Visit the&amp;nbsp;&lt;a href="http://www.zeronines.com/"&gt;ZeroNines.com website&lt;/a&gt;&amp;nbsp;to find out more about how our disaster-proof architecture can protect businesses (and government agencies) of any description from downtime.&lt;br /&gt;&lt;br /&gt;Alan Gin &lt;em&gt;– Founder &amp;amp; CEO, ZeroNines&lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-390951587522325410?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/390951587522325410/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2009/12/most-businesses-dont-know-what-downtime.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/390951587522325410'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/390951587522325410'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2009/12/most-businesses-dont-know-what-downtime.html' title='Most Businesses Don’t Know what Downtime Costs Them'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-1035701724897832212</id><published>2009-11-23T17:40:00.000-08:00</published><updated>2009-12-02T17:50:12.916-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='FAA airline Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><title type='text'>Fixing The FAA’s Single Point of Failure</title><content type='html'>“The difficulties started when a single circuit board in a piece of networking equipment at a computer center in Salt Lake City failed around 5 a.m…” [&lt;a href="http://www.foxnews.com/story/0,2933,575707,00.html"&gt;source&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;All too often it seems that the biggest problems are caused by the smallest failures. This blog is full of posts about how generator transfer switches, router programming changes, and problematic network hardware can bring businesses to their knees. Now a single circuit board failure causes havoc among airlines, airports, and air travelers.&lt;br /&gt;&lt;br /&gt;I stand by my earlier assertion: Trying to guarantee application and data uptime by eliminating all possible sources of failure is not possible. The more complex a system gets, the more likely some part is going to fail, and it is impossible to identify them all. But there is a way to prevent these little disasters from becoming big ones.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Background: The Flight Plan Management System&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;The failed FAA computer system was the National Airspace Data Interchange Network [&lt;a href="http://www.washingtonpost.com/wp-dyn/content/article/2009/11/19/AR2009111901081.html"&gt;source&lt;/a&gt;], which manages flight plans and ground traffic. 
The Salt Lake City facility is one of two nationwide computer centers that collect flight plans. The other is in Atlanta. This was the third time since June 2007 the system has failed [&lt;a href="http://www.washingtonpost.com/wp-dyn/content/article/2009/11/19/AR2009111901081.html"&gt;source&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Problem: Hardware Failure Blocks Access&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;When the circuit board failed on November 19, 2009, access to data and communications was blocked, making flight plans filed by airlines inaccessible [&lt;a href="http://www.foxnews.com/story/0,2933,575707,00.html"&gt;source&lt;/a&gt;]. Air traffic controllers had to enter flight plans manually in several parts of the U.S. The problem was fixed about five hours later.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Cost: Mostly to the Airlines&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;The FAA being a governmental agency, no direct fiscal impact can be readily estimated. However, the cost to airlines has to be considerable, since many flights were canceled or delayed. Airline stocks were down that day – whether the computer failure was the cause or not – and our poor beleaguered airlines can’t help but suffer when something like this happens. The stocks were still down even after the problem had been fixed [&lt;a href="http://articles.moneycentral.msn.com/Investing/Dispatch/market-dispatches.aspx?post=1391002"&gt;source&lt;/a&gt;]. And of course individual travelers, such as myself, will bear the brunt too in the form of delays, costlier alternative travel, and unplanned hotel stays. Not to mention missed business meetings, which can cost a business a lot more than a replacement airline ticket. 
The domino effect of airline delays is a disaster unto itself.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Solution: Sidestep the Single Point of Failure&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Doug Church, a spokesman for the National Air Traffic Controllers Association, said, "We think it's a single-point failure that occurred somewhere in the system. One single glitch was able to shut down the entire system." [&lt;a href="http://www.washingtonpost.com/wp-dyn/content/article/2009/11/19/AR2009111901081.html"&gt;source&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;This is perhaps the scariest statement about the whole affair. The simple fact that they went dark shows that their backup systems also failed. This is not surprising; most disaster recovery systems use the “failover” or “cutover” technique, which is outdated and unreliable and can lead to cascading failures and increased downtime. Such occurrences are frighteningly common.&lt;br /&gt;&lt;br /&gt;At ZeroNines we propose a different approach. Instead of trying to catch a downtime event with a failover recovery, like a ninja trying to catch an arrow, we simply double up all the processing in multiple data centers around the country or around the world. Each processes the same thing at the same time, so if “a single circuit board in a piece of networking equipment at a computer center in Salt Lake City” fails, the additional networking equipment in Atlanta or Omaha or Dusseldorf or wherever keeps on processing.&lt;br /&gt;&lt;br /&gt;The likelihood of application or data downtime – where users lose access to the tools and information they need to do their jobs – drops to virtually zero because the chance of all data centers, or clouds, or virtual environments failing simultaneously is statistically negligible. 
In this instance, had the National Airspace Data Interchange Network been protected by our Always Available™ technology, the Atlanta network node would simply have continued processing while Salt Lake City was repaired and brought back online. Then the system would have automatically updated Salt Lake with all the transactions that had occurred in its absence.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.zeronines.com/"&gt;Visit the ZeroNines website&lt;/a&gt; to find out more about how our disaster-proof architecture can protect businesses (and government agencies) of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-1035701724897832212?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/1035701724897832212/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2009/11/fixing-faas-single-point-of-failure.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/1035701724897832212'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/1035701724897832212'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2009/11/fixing-faas-single-point-of-failure.html' title='Fixing The FAA’s Single Point of Failure'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' 
src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-3581249347475943237</id><published>2009-10-22T10:55:00.000-07:00</published><updated>2009-11-14T11:02:17.764-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='email'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud storage'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='Cloud Computing'/><title type='text'>Enabling Cloud Confidence</title><content type='html'>A week ago I wrote about the Sidekick disaster and how events like that just keep doubts growing, pushing the wholesale adoption of the Cloud further away. This doubt has made it into the mainstream media, where it will taint the opinions of potential cloud users, both consumer and commercial. We at ZeroNines think we have the solution that will enable the cloud to perform as it needs to.&lt;br /&gt;&lt;br /&gt;The core problem with outages is not the existence of hazards that can damage servers and knock elements of a network (cloud or otherwise) offline. Storms, fires, and equipment failure will always happen and there is no way to eliminate them. 
The real problem is the reliance of cloud providers on obsolete failover-based recovery paradigms that simply can’t maintain continuity when disaster does strike.&lt;br /&gt;&lt;br /&gt;L.A. Times columnist David Sarno perfectly sums up the cloud’s tenuous situation in his October 18 article “Still hazy on cloud computers' security” [&lt;a href="http://www.latimes.com/business/la-fi-cloud19-2009oct19,0,5982033.story"&gt;source&lt;/a&gt;]. “A series of incidents involving cloud computing over the last several months has poked holes in the hype bubble, raising questions about the cloud's dependability -- and whether it's ready for use by a broader group of workers and businesses.” He is right on target.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Meeting the Need to Fortify&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;As Sarno puts it, “As e-mail, word processing and data storage continue to move from users' computers to the Web, companies must fortify their servers from a variety of potential disasters -- natural and man-made -- to help ensure that the data and the applications are accessible at all times.” He quotes Google’s SEC filing:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;"(Google’s) systems are vulnerable to damage or interruption from earthquakes, terrorist attacks, floods, fires, power loss, telecommunications failures, computer viruses, computer denial of service attacks" as well as sabotage and vandalism...&lt;br /&gt;&lt;/blockquote&gt;The good news is that today, ZeroNines' Always Available™ CloudNines™ technology can fortify servers from damage or interruption from earthquakes, terrorist attacks, floods, fires, power loss, telecommunications failures, computer denial of service attacks, as well as sabotage and vandalism. 
We leave the viruses to others to deal with, but we can add most types of routine maintenance, unplanned maintenance, data migrations, equipment upgrades, software upgrades, and a number of other potential causes of downtime.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Forget Failover&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;The IT world fatalistically believes that downtime is inevitable, and is something to be lived with and minimized if you’re fortunate. This view predominates because until now the only disaster recovery solution available has been the flawed failover paradigm, which everyone in IT knows can be a disaster unto itself. During a crisis or failover event, cutover can cause additional problems, downtime, and cascading application failures as computing switches from primary to backup systems.&lt;br /&gt;&lt;br /&gt;But the IT world has it wrong. Disasters will happen and must be dealt with, but the downtime they cause can be prevented.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Always Available™ Means Virtually 100% Uptime&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;ZeroNines’ Always Available™ solution eliminates failover and backups, instead providing synchronous identical processing on multiple cloud nodes geographically separated by thousands of miles. If a storm wipes out your East Coast cloud, CloudNines enables processing to continue on clouds in other parts of the country and around the world. If you need to upgrade server software, you can isolate one cloud node, do your upgrade, and bring it back online once it is stable. Our technology has journaling and updating features to assure that all transactions are completed and that any cloud node that goes offline is brought up to the most accurate logical state once it comes back online.&lt;br /&gt;&lt;br /&gt;CloudNines can push application availability beyond the industry-accepted standard of 99.999% (five nines) to virtually 100%. 
In our ongoing test case, the ZeroNines MyFailSafe environment has never experienced any downtime at all, for any reason. It went live in July 2004, and had individual network nodes knocked offline a number of times due to hurricanes, power outages, server migrations, and other causes. All applications experienced full 100% availability throughout.&lt;br /&gt;&lt;br /&gt;Will ZeroNines eventually be recognized as a vital cloud-enabling technology? That remains to be seen but you can bet that is how we see ourselves. If you want to find out how we can make the cloud a viable option for you, let me know.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.zeronines.com/"&gt;Visit the ZeroNines website&lt;/a&gt; to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-3581249347475943237?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/3581249347475943237/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2009/10/enabling-cloud-confidence.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/3581249347475943237'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/3581249347475943237'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2009/10/enabling-cloud-confidence.html' title='Enabling Cloud Confidence'/><author><name>ZeroNines® Technology, 
Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-6378072193479191258</id><published>2009-10-15T09:34:00.000-07:00</published><updated>2009-10-23T09:46:01.488-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='T-Mobile'/><category scheme='http://www.blogger.com/atom/ns#' term='Sidekick'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud storage'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='Cloud Computing'/><title type='text'>Sidekick: Two Disasters for the Price of One Really Big One</title><content type='html'>&lt;p&gt;Already being called one of the largest data failures in recent memory, October’s Sidekick disaster was actually two disasters rolled into one. First, the cloud-based service suffered an outage which stranded thousands of users. Second, the backup/storage system failed and erased the personal data of thousands of users. Every failure like this leads to a round of hand-wringing over the cloud, and this one is no different. 
It underscores the need for a far more robust cloud architecture, where a failure in one area is truly isolated from the rest of the system and can’t cause an outage.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Background: Sidekick and the Cloud&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The Sidekick mobile device is developed by Microsoft subsidiary Danger and is sold and serviced by T-Mobile. It holds a special place in the hearts and hands of a select group of users because its QWERTY keyboard promises ease of use and its cloud-based data storage gives it the appearance of real go-anywhere, do-it-anytime utility. Unlike other hand-helds like the iPhone and Blackberry, the Sidekick backs up personal data to cloud-based storage at Microsoft and not to your computer’s hard drive. And there’s the seed of the trouble.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The problem: Hardware Failure Leads to Database Failure&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It seems that beginning at about 1:30 AM on Friday October 2 [&lt;a href="http://news.cnet.com/8301-13860_3-10368709-56.html"&gt;source&lt;/a&gt;], a “hardware failure… took out both the primary and backup copies of the database that contained Sidekick users' information.” [&lt;a href="http://news.cnet.com/8301-13860_3-10373500-56.html?tag=mncol;posts"&gt;source&lt;/a&gt;] This apparently occurred during an upgrade to the Danger/Microsoft Storage Area Network [&lt;a href="http://www.tuaw.com/2009/10/12/the-t-mobile-sidekick-data-failure-and-what-it-means-to-iphone/"&gt;source&lt;/a&gt;]. When they discovered their Sidekicks weren’t working, many users re-set their Sidekicks (some under instructions from T-Mobile customer service) which wiped the devices’ hard drives. Combined with the back-end server failure, this led to apparent permanent data loss for anyone who tried to re-set their Sidekicks.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The cost to T-Mobile and Microsoft&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;This is going to cost millions. At least. 
T-Mobile halted sales of all Sidekicks shortly after the event and is compensating its affected users with a period of free data service [&lt;a href="http://news.cnet.com/8301-13860_3-10372921-56.html?tag=mncol;txt"&gt;source&lt;/a&gt;]. There were the usual rants about users refusing to continue paying on their contracts, and news that T-Mobile was voluntarily letting anyone out of their contract who wanted out [&lt;a href="http://news.cnet.com/8301-13860_3-10372921-56.html?tag=mncol;txt"&gt;source&lt;/a&gt;]. Lawsuits were filed [&lt;a href="http://news.cnet.com/8301-13860_3-10375240-56.html?tag=mncol;txt"&gt;source&lt;/a&gt;]. Sarcasm and criticism run thick online. Whatever the actual facts, this is a marketing disaster of the greatest degree for T-Mobile and Microsoft. There is no way to calculate how many of the approximately 800,000 existing Sidekick customers [&lt;a href="http://news.cnet.com/8301-13860_3-10372921-56.html?tag=mncol;txt"&gt;source&lt;/a&gt;] will jump ship, how many potential new customers will be lost, and what this means for Microsoft’s “Pink” project, the intended follow-on to the Sidekick [&lt;a href="http://news.cnet.com/8301-13860_3-10360516-56.html?tag=mncol;txt"&gt;source&lt;/a&gt;]. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;The Solution: A Robust Cloud&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;ZeroNines’ CloudNines™ product enables the cloud to function as it is supposed to, by processing every transaction simultaneously and equally on multiple cloud-based network nodes in an Always Available™ configuration. In the Sidekick disaster, CloudNines would simply have cut off the node with the hardware failure. All processing would have continued on other geographically separated nodes that were running identical active instances of the affected applications and databases. The failure would have been contained. 
There would have been no service downtime, and no need for ill-advised attempts to re-boot individual Sidekicks.&lt;br /&gt;&lt;br /&gt;Not only would the Sidekick applications have continued operation, but the databases would too. There would have been no apparent loss of customer data. After the event, one author bitingly asked “But the question remains, why wasn't there a true independent backup of the data?” [&lt;a href="http://news.cnet.com/8301-13860_3-10373500-56.html?tag=mncol;posts"&gt;source&lt;/a&gt;]. ZeroNines and Always Available technology would have made this a moot point. &lt;/p&gt;&lt;p&gt;As of this writing, T-Mobile and Microsoft have announced that they “have recovered most, if not all, customer data” [&lt;a href="http://forums.t-mobile.com/tmbl/?category.id=Sidekick"&gt;source&lt;/a&gt;]. I can’t help but breathe a sigh of relief for them even though I am not a Sidekicker myself. But wouldn’t it have been far better to have avoided the problem in the first place?&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.zeronines.com/"&gt;Visit the ZeroNines website &lt;/a&gt;to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/em&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-6378072193479191258?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/6378072193479191258/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2009/10/two-disasters-for-price-of-one-really.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/2171310898568852604/posts/default/6378072193479191258'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/6378072193479191258'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2009/10/two-disasters-for-price-of-one-really.html' title='Sidekick: Two Disasters for the Price of One Really Big One'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-4593843757019843256</id><published>2009-09-12T12:03:00.000-07:00</published><updated>2009-09-23T11:52:11.086-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='email'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='Gmail'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><title type='text'>Gmail Maintenance Leads to Router Overload</title><content type='html'>It is often the mundane problems that cause the most trouble. 
The two-hour Gmail outage on Tuesday, September 1, 2009 had a fairly unspectacular cause, but its effects are shaking a tech giant’s plans and causing some commentators to wring their hands over the acceptance of SaaS offerings in general.&lt;br /&gt;&lt;br /&gt;Whatever its effects on the industry, this is one of several outages in the past year which are harming Google’s efforts to sell its email services as a corporate tool. At the very least it cost them a lot of money. Fortunately, such outages are avoidable.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Background: Google Hopes for Significant Gmail Revenue&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Gmail is Google’s free email app, and is used worldwide by millions of people. Gmail also has paid services, and Google is trying to build it up into a corporate app that can generate significant revenue. Analysts and customers alike have been watching it closely over the years to see if it really can grow into a reliable corporate power tool, but have been disappointed by a number of recent outages.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Problem: A Classic Cascade Failure&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Last Tuesday’s problem “was caused by a classic cascade in which servers became overwhelmed with traffic in rapid succession” [&lt;a href="http://news.cnet.com/8301-30684_3-10323837-265.html"&gt;source&lt;/a&gt;]. Google had taken several Gmail servers offline for maintenance. Recent changes to routers were intended to increase routing efficiency, but instead caused some routers to become overloaded. Traffic got shunted to an increasingly small pool of available routers until the system collapsed.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Cost to Google&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Google wants to get more customers onto its paid Gmail service. 
The outage adds to the image of Gmail as being not stable enough for business use and makes it harder to persuade corporate users to actually pay for it.&lt;br /&gt;&lt;br /&gt;By way of compensation, Google “…added three days to year-long subscriptions to its corporate Google Apps email service, which costs $50 per-user-per-year.” [&lt;a href="http://www.reuters.com/article/rbssTechMediaTelecomNews/idUSN0238071420090902"&gt;source&lt;/a&gt;] This equates to approximately $50 million. Unfortunately, users would rather have uptime than compensation, and Google got a lot of bad publicity which will make it harder to get business users to switch from other offerings. [&lt;a href="http://www.reuters.com/article/rbssTechMediaTelecomNews/idUSN0238071420090902"&gt;source&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Solution: A Network that Can Absorb Failures&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;“Google said it would focus on making sure that the request routers have sufficient headroom to handle future spikes in demand, as well as figuring out a way to make sure that problems in one sector can be isolated without bringing down the entire service.” [&lt;a href="http://news.cnet.com/8301-30684_3-10323837-265.html"&gt;source&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Isolation of problem servers or nodes is a core function of ZeroNines’ Always Available™ technology. If they had been using Always Available, Gmail could have tested their new router/server configuration in isolation while the rest of the network was left to operate in the usual way. The new configuration could have been rolled out one server at a time without interrupting service. 
If one or more of the newly configured routers became unstable, that failure would have been confined to just that sector and the rest of the Gmail network could have continued processing in its usual way.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.zeronines.com/"&gt;Visit the ZeroNines website &lt;/a&gt;to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – Founder &amp;amp; CEO, ZeroNines &lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-4593843757019843256?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/4593843757019843256/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2009/09/gmail-maintenance-leads-to-router.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/4593843757019843256'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/4593843757019843256'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2009/09/gmail-maintenance-leads-to-router.html' title='Gmail Maintenance Leads to Router Overload'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' 
src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-2926917024660932374</id><published>2009-08-10T10:38:00.000-07:00</published><updated>2009-09-12T12:09:47.917-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='PayPal'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><title type='text'>PayPal™ Network Equipment Failure Strands Consumers &amp; Merchants</title><content type='html'>&lt;blockquote&gt;“How can it be that a single piece of network hardware brings down a business critical part of a network? It used to be that all financial institutions ensured that they remained available at all times regardless of cost. Is it possible that the current crop of engineers don’t make this a must have feature of their designs?” (&lt;a href="https://www.thepaypalblog.com/2009/08/paypal%e2%80%99s-service-interruption/"&gt;source&lt;/a&gt;)&lt;/blockquote&gt;&lt;p&gt;&lt;br /&gt;These tough questions were posted by a user on the PayPal™ blog after a network equipment failure took down the popular payment service for a full hour worldwide on August 3, 2009 (&lt;a href="http://news.cnet.com/8301-1023_3-10302072-93.html"&gt;source&lt;/a&gt;). Today I’ll take a quick look at this outage and offer a solution for preventing similar disasters.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Background: PayPal and its Impact&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;PayPal is one of those breakthrough solutions that has enabled e-commerce to take off as it has. 
Its low fraud rate and ease of use may make it the prototype for apps that could replace credit card accounts as the preferred means of making online payments. By way of illustrating PayPal’s importance, here are some stats from the &lt;a href="https://www.paypal-media.com/documentdisplay.cfm?DocumentID=2260"&gt;company’s media site&lt;/a&gt;:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;PayPal's net Total Payment Volume for 2008, the total value of transactions, was $60 billion.&lt;/li&gt;&lt;li&gt;PayPal has 73 million active registered accounts (184 million total accounts).&lt;/li&gt;&lt;li&gt;PayPal supports payments in 19 currencies.&lt;/li&gt;&lt;li&gt;PayPal's revenues now represent 32% of eBay Inc. companywide revenues.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;The Problem: Network Equipment Failure&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;According to an August 3 announcement by PayPal’s SVP Technology…&lt;br /&gt;&lt;/p&gt;&lt;blockquote&gt;“At around 10:30 am PT Monday, a network hardware failure resulted in a service interruption for all PayPal users worldwide. Everyone in our organization focused immediately on identifying the issue and getting PayPal up and running again. We accomplished that in about an hour. By approximately 3 pm PT, full service was restored across our platform.” (&lt;a href="https://www.thepaypalblog.com/2009/08/paypal%e2%80%99s-service-interruption/"&gt;source&lt;/a&gt;)&lt;br /&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;br /&gt;At the rate PayPal transacts, one hour of downtime means about $7,000,000 in lost or delayed transactions (&lt;a href="http://news.cnet.com/8301-1023_3-10302072-93.html"&gt;source&lt;/a&gt;). Some comments from that and other blogs nicely illustrate the downstream effects:&lt;br /&gt;&lt;br /&gt;"We have been down for the better part of the day. We are still down. I am a very unhappy customer. This failure has cost me thousands of dollars." 
(&lt;a href="https://www.thepaypalblog.com/2009/08/paypal-experiencing-site-issues/"&gt;source&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;"I am beginning to question my use of the product considering there does not appear to be a high availability solution in place." (&lt;a href="https://www.thepaypalblog.com/2009/08/paypal%e2%80%99s-service-interruption/"&gt;source&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Solution: Remove the Single Point of Failure&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;ZeroNines’ Always Available™ technology is the high availability solution this PayPal customer was wishing for. It could have prevented last week’s downtime event because it processes network transactions synchronously on multiple data centers, clouds, or virtualized environments through multiple network paths. There is no hierarchy and no single point of failure. In case of an equipment failure, power outage, application crash, storm, or other catastrophe in one area, processing would simply continue via other network nodes, switches, and data centers. The business disaster does not occur because users never lose access to the apps, data, and services they need.&lt;br /&gt;&lt;br /&gt;Always Available would also have enabled automatic update of the errant server once it was brought back online, preventing PayPal from having to use “everyone in their organization” just go get things going again. 
Of course the IT department still needs to replace the failed equipment, but that can be done in isolation as the rest of the business carries on as usual.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.zeronines.com/"&gt;Visit the ZeroNines website &lt;/a&gt;to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;/p&gt;&lt;p&gt; &lt;/p&gt;&lt;p&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; –&lt;/em&gt; &lt;em&gt;Founder &amp;amp; CEO, ZeroNines&lt;/em&gt; &lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-2926917024660932374?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/2926917024660932374/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2009/08/paypal-network-equipment-failure.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/2926917024660932374'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/2926917024660932374'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2009/08/paypal-network-equipment-failure.html' title='PayPal™ Network Equipment Failure Strands Consumers &amp; Merchants'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' 
src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-4791646905100704651</id><published>2009-07-08T15:08:00.000-07:00</published><updated>2009-08-19T10:53:49.434-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><title type='text'>Seattle Database Fire Unnecessarily Shuts Down Businesses and Online Services</title><content type='html'>Cascade failure. If you’re in IT, that’s a particularly frightening term. In the case of last week’s Seattle data center fire, the term is especially appropriate since it was literally a cascade of water that wrecked everything and sent a number of businesses and online services offline. 
Here’s a look at this disaster and a way it could have been prevented.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Background: Fisher Plaza, a Major Hosting Facility&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Fisher Plaza is “a self-styled carrier hotel in Seattle, and home to multiple datacenter and colocation providers.” [&lt;a href="http://www.geek.com/articles/news/fire-in-seattle-server-hosting-room-takes-down-many-internet-businesses-2009076/"&gt;source&lt;/a&gt;] A partial list of organizations hosted there includes: payment service provider Authorize.net (which itself has 238,000 merchant customers), Port of Seattle email system, Swedish Hospital’s internal IT systems, Pacific Science Center website, geocaching.com website, major TV and radio station KOMO, online Facebook game &lt;em&gt;Bejeweled Blitz&lt;/em&gt; and dozens of other businesses [&lt;a href="http://www.betanews.com/article/Fire-in-downtown-Seattle-data-center-knocks-out-businesses-online-services/1246640587"&gt;source&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Problem: Fire Leads to Cascade Failure&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Early on Friday morning, July 3, 2009, Fisher Plaza’s main generator/transfer switch failed. This caused an overload. This caused a fire. This triggered the fire suppression system and brought firefighters to the scene, both of which shot water into the generator room. The generators stopped, and we deduce that power from the grid was shut off too. The UPS and the cooling system also failed. Temperatures in the facility rose high enough to wreck some servers and destroy data [&lt;a href="http://www.geek.com/articles/news/fire-in-seattle-server-hosting-room-takes-down-many-internet-businesses-2009076/"&gt;source&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;Think about the downstream effects. 238,000 merchants potentially have their transactions interrupted or lost because Authorize.net’s servers are forced offline. 
One can only hope they had their own functioning backup plan. A hospital’s IT system became unavailable; I have no information on what impact this had on patient care. And apparently KOMO had to transmit from a mobile unit in their parking lot [&lt;a href="http://www.betanews.com/article/Fire-in-downtown-Seattle-data-center-knocks-out-businesses-online-services/1246640587"&gt;source&lt;/a&gt;]. It is not hard to imagine the impact to these and other organizations.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Solution: Fire- and Flood-Proof Hosting&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;No, ZeroNines does not wrap servers in asbestos. There is no way to know what bizarre little accident will happen next, so prevention is unlikely. Some will trigger chain reactions that become major IT disasters.&lt;br /&gt;&lt;br /&gt;What we do is to prevent a catastrophe in one place from knocking out a business everyplace. In this case, if any of the clients or tenants at Fisher Plaza had been using our technology, their data, transactions, apps, and other assets would have all been processing simultaneously and in perfect replication in other data centers hundreds or thousands of miles away.&lt;br /&gt;&lt;br /&gt;This is not a cutover scenario. Processing would not have “switched” from Seattle to elsewhere. It simply would have stopped in Seattle and &lt;em&gt;continued in real time&lt;/em&gt; in San Jose, or Denver, or Singapore, or wherever else they placed their data centers. There would be no loss of business continuity. 
Their businesses would not have gone down, and the real disaster – lost connectivity, productivity, and revenue – would not have taken place.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.zeronines.com/"&gt;Visit the ZeroNines website&lt;/a&gt; to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-4791646905100704651?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/4791646905100704651/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2009/07/seattle-database-fire-unnecessarily.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/4791646905100704651'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/4791646905100704651'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2009/07/seattle-database-fire-unnecessarily.html' title='Seattle Database Fire Unnecessarily Shuts Down Businesses and Online Services'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' 
src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-7059123976972689738</id><published>2009-06-30T14:52:00.000-07:00</published><updated>2009-08-19T14:59:08.557-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud storage'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><category scheme='http://www.blogger.com/atom/ns#' term='Cloud Computing'/><title type='text'>Uptime and the Cloud Crowd at CSIA</title><content type='html'>A few days ago, Jake Smith of Intel and I presented at the Colorado Software Industry Association (CSIA) monthly meeting in Denver (&lt;a href="http://www.rockyradar.com/infotech/?p=440"&gt;source&lt;/a&gt;). We talked about cloud computing and the elements that will determine its rate of adoption: the needs of businesses, their expectations of cloud performance, and the real-world limitations of the cloud that are currently stalling its adoption. The biggest issue is reliability, and I introduced ZeroNines’ technology as a potential solution. It was a great crowd, and their hunger for a reliable cloud was obvious.&lt;br /&gt;&lt;br /&gt;Businesses need their applications and data to be available all the time. So far, clouds and cloud providers have not succeeded in proving that they can actually offer that. 
The industry needs to overcome the cloud’s downtime problems before serious business can be done on it. I believe the Big Three (Amazon, Azure, and Google) will refocus their efforts on providing highly available cloud infrastructures and market this capability accordingly.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Cause is Academic&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Of course every network is subject to threats and failures that can cause downtime, and there’s no getting away from that. It doesn’t take an earthquake to knock vital networked apps offline; some recent high-profile cloud provider outages have shown that all it takes is a failed OS upgrade. New and unexpected problems crop up every day. But the cause of an outage is really only academic for the business relying on the cloud. Service should simply continue because the business needs it to.&lt;br /&gt;&lt;br /&gt;The scary thing is that the current disaster recovery paradigm (failover) is insufficient for protecting businesses when these things happen, and can’t be relied upon to prevent downtime or even to deliver a speedy recovery. In addition, there is an increase in catastrophic risk from poorly architected virtualized environments, most notably from server consolidation, which is a core technology of the cloud.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Solution is Continuity&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;At the CSIA meeting, we introduced the crowd to our Always Available™ technology, which maintains cloud continuity by synchronizing and protecting multiple private, public or hybrid clouds. It can mix cloud computing and physical hosting via datacenters hundreds or thousands of miles apart. The distance prevents any single regional disaster from damaging more than one data center. There is no server hierarchy, so all transactions run simultaneously and equally on all cloud and server nodes. 
Best of all, they update each other constantly in real time so if one goes down the others simply continue processing with no interruption to service.&lt;br /&gt;&lt;br /&gt;To protect against an outage during an upgrade, I would postulate the following solution: Isolate one cloud or network node in an Always Available configuration and do your upgrade there, while the other nodes manage the clients’ transactions. Test the upgrade and slowly roll it out to the other nodes. If things start to go haywire, isolate the misbehaving node, solve your problems, and start the rollout again. There would be no need to risk the entire service on an untested upgrade.&lt;br /&gt;&lt;br /&gt;Always Available works for cloud customers as well as service providers. It is provider- and platform-agnostic, so you can mix and match all you need to.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.zeronines.com/"&gt;Visit the ZeroNines website &lt;/a&gt;to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-7059123976972689738?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/7059123976972689738/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2009/08/uptime-and-cloud-crowd-at-csia.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/7059123976972689738'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/7059123976972689738'/><link rel='alternate' 
type='text/html' href='http://zeronines.blogspot.com/2009/08/uptime-and-cloud-crowd-at-csia.html' title='Uptime and the Cloud Crowd at CSIA'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-8857868155129331855</id><published>2009-01-12T09:00:00.000-08:00</published><updated>2009-07-14T15:08:19.973-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='outage'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='Always Available'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><title type='text'>Hurricane Charley Couldn’t Stop the Email</title><content type='html'>In this first Disaster Litany Posting, I look at a sequence of events that is near and dear to ZeroNines. Our own real-world experience with a hurricane, power outages, and an email system will show just how our downtime-preventing technology works, and serve as a pattern for the solutions we suggest for other disasters.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Background: MyFailSafe™ Email System&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;ZeroNines offers Always Available technology that can virtually eliminate downtime among networked applications, data, and other assets. 
To test our technology, we created the MyFailSafe Email Service and launched it on our Always Available network in July of 2004. This was specifically intended to test Always Available in the real world, by running MyFailSafe just like any other email service is run, with real customers and real traffic, and subject to the same threats that any other network or email system is vulnerable to.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Problem: A Hurricane&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;All readers who remember Hurricane Charley please raise your hands… For those of you who don’t, Charley hit Florida on August 13, 2004. According to Wikipedia, it killed about thirty people and caused $15 billion in damage. Widespread flooding, wind damage, power outages, and other problems crippled much of the state for several days. I don’t have statistics on downtime among private business networks or service providers, but it’s a safe bet that it was serious.&lt;br /&gt;&lt;br /&gt;Charley hit about a month after we launched MyFailSafe. It caused electrical grid fluctuations that drained the Orlando local exchange carrier battery backup systems, isolating the Orlando node of the ZeroNines Always Available infrastructure. Our own battery system prevailed and still had a 75% charge when commercial power was reliably restored, but the site could not communicate for 16 hours because of LEC downtime.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Solution: Hurricane-Proof Architecture&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;During those 16 hours, when our Orlando node was effectively offline, the MyFailSafe email service did not experience any downtime at all. Any user whose power was still on and whose desk was not under water experienced true 100% uptime throughout, whether they were in Florida, Colorado, Canada, Asia, or anywhere else.&lt;br /&gt;&lt;br /&gt;How? Our Always Available deployment has additional nodes and data centers in Colorado and California. 
All applications, transactions, data exchanges, and other network activities run equally and simultaneously on these multiple secure application servers, geographically separated by hundreds of miles. In IT parlance, all servers are hot, and all instances of all applications are active. There is no server hierarchy, and consequently no single point of failure. When the Orlando node fell silent, all MyFailSafe processing continued uninterrupted on the others. There was no need for failover or recovery because these other nodes were far from the storm, they never went down, and continuity was maintained.&lt;br /&gt;&lt;br /&gt;Since activation on July 15, 2004, the MyFailSafe network has never experienced any downtime for any reason, including this and other hurricanes, two migrations from server colocation providers to clouds, a data center move, and an email worm attack that interrupted email service from AOL and other major providers. These potential disasters, which forced our servers offline, had no power to bring our applications down. 
All applications and information retained 100% availability throughout.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.zeronines.com/"&gt;Contact ZeroNines&lt;/a&gt; to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – Founder &amp;amp; CEO, ZeroNines&lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-8857868155129331855?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/8857868155129331855/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2009/07/hurricane-charley-couldnt-stop-email.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/8857868155129331855'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/8857868155129331855'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2009/07/hurricane-charley-couldnt-stop-email.html' title='Hurricane Charley Couldn’t Stop the Email'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-8865121368856679601</id><published>2009-01-03T08:00:00.000-08:00</published><updated>2009-07-08T11:27:44.918-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='uptime'/><category scheme='http://www.blogger.com/atom/ns#' term='failsafe'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='downtime'/><category scheme='http://www.blogger.com/atom/ns#' term='continuity'/><title type='text'>A Litany of Disasters: Downtime Events and How to Avoid Them</title><content type='html'>&lt;strong&gt;“Aviation in itself is not inherently dangerous. But to an even greater degree than the sea, it is terribly unforgiving of any carelessness, incapacity, or neglect."&lt;br /&gt;-- Anonymous&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Years ago, I saw those words on a poster of a World War One aircraft stuck about 20 feet off the ground in the limbs of a tree. If we were to update this and adapt it to the business user’s desktop, it would lose its poetic charm but strike home with a whole new audience:&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;“Networked assets in themselves are not inherently dangerous. But to an even greater degree than stuff on your hard drive, they are terribly unforgiving of any carelessness, incapacity, or neglect."&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;The warning is clear: Disaster may be only inches away, particularly for the unprepared. It’s a lot harder to recover after some accident knocks out a hundred or a thousand users than it is to re-boot your own machine.&lt;br /&gt;&lt;br /&gt;In this blog, we will be looking at some actual disasters that have struck organizations when their networks have taken a hit from storms, fires, attacks, and far more mundane threats like human error and equipment failure.&lt;br /&gt;&lt;br /&gt;For a business, there may be little correlation between the physical effects of a disaster and its financial impact. Imagine a business dependent upon a distant data center in the U.S. Tornado Belt. 
One good storm could leave their personnel and property untouched, yet destroy their ability to do business by wiping out their data, applications, and transactions. Elsewhere, an earthquake could cause deplorable loss of life and property damage, yet leave a business relatively unharmed if its networked computing capabilities remain intact. And an otherwise strong corporation could suffer irreparable damage by something as quiet as a software failure or equipment malfunction, which to the outside world does not qualify as a “disaster” at all.&lt;br /&gt;&lt;br /&gt;I’ll be describing some instances where ZeroNines’ solutions for networks, virtualized environments, and clouds could have prevented disastrous downtime, and helped avoid unwanted headlines and losses to productivity, reputation, and revenue. Our approach does not use any kind of failover or cutover, since those occur after the downtime event and are not true disaster prevention. After all, it’s far better to avoid the downtime in the first place than to try to recover from it afterward.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Next week:&lt;/strong&gt; How MyFailSafe really did provide fail-safe email during Hurricane Charley.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.zeronines.com/"&gt;Contact ZeroNines&lt;/a&gt; to find out more about how our disaster-proof architecture protects businesses of any description from downtime.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;strong&gt;Alan Gin&lt;/strong&gt; – ZeroNines, Founder &amp;amp; CEO&lt;/em&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-8865121368856679601?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/8865121368856679601/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://zeronines.blogspot.com/2009/07/litany-of-disasters-downtime-events-and.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/8865121368856679601'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/8865121368856679601'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2009/07/litany-of-disasters-downtime-events-and.html' title='A Litany of Disasters: Downtime Events and How to Avoid Them'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2171310898568852604.post-8780728266440749716</id><published>2008-09-15T07:00:00.000-07:00</published><updated>2009-07-10T14:24:45.785-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cloud storage'/><category scheme='http://www.blogger.com/atom/ns#' term='ZeroNines'/><category scheme='http://www.blogger.com/atom/ns#' term='disaster recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='Cloud Computing'/><title type='text'>Cloud Failure: The Myth of Nines</title><content type='html'>&lt;a href="http://elasticvapor.com/2008/09/cloud-failure-myth-of-nines.html"&gt;Visit Reuven Cohen's blog&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;For about as long as there have been computer networks, administrators have attempted to keep these networks up and running. It seems to be a continuous battle between faulty hardware, poorly written software, unreliable connectivity and random acts of God. 
With the emergence of cloud computing, we are now, for the first time, close to realizing a computing environment where we are able to focus less on keeping our applications up and more on making them run more efficiently and effectively.&lt;br /&gt;&lt;br /&gt;In the era of cloud computing, uptime guarantees and service level agreements (SLAs) have started to become standard requirements for most cloud providers. Google, Amazon, and Microsoft have all started to implement some kind of SLA. They do this in an attempt to give their cloud users the confidence to utilize these systems in place of more common in-house alternatives. The common goal for most of these cloud platforms is to build for what I consider the myth of five nines. (Five nines means 99.999% availability, which translates to a total downtime of approximately five minutes and fifteen seconds per year.) The problem with five nines is that it's a meaningless goal that can be manipulated to mean whatever you need it to mean.&lt;br /&gt;&lt;br /&gt;In the case of a physical failure such as Flexiscale's recent one, the hardware downtime might be small, but the time to restore from a backup might be considerably longer. A minor cloud failure could cause a cascading series of software failures, causing further application outages of hours or even days for those who depend on the availability of the given cloud. This means your cloud may achieve five nines while the application hosted on it doesn't.&lt;br /&gt;&lt;br /&gt;Lately it seems there are a number of people in the cloud computing community who are starting to discuss alternatives to the dreaded five nines concept and looking at ways that cloud-based infrastructures could be configured and deployed in a manner that is more proactive than reactive to disasters. There is a growing consensus that cloud-based disaster recovery may very well be the "killer app" for cloud computing. 
To achieve this, we need to start creating reference architectures and models that assume failure: ones that don't need to worry about when the next disaster will happen, just that it will happen, and that when it does, it's going to be business as usual.&lt;br /&gt;&lt;br /&gt;In a recent conversation, Alan Gin, founder of a super-secret stealth firm called ZeroNines, described an interesting philosophy. He said the problem with most disaster recovery plans is that the recovery is reactive: it is what happens after a disaster has already harmed your business. He said that, on its face, this is an unsound strategy. He went on to say that the current disaster recovery architecture, for which “failover” is used as a synonym, is based on the cutover archetype: a system’s primary component fails, damaging operations; then failover to a secondary component is attempted to resume operations. The problem with current cutover approaches is that they view unplanned downtime as inevitable and acceptable, and so require that business halt.&lt;br /&gt;&lt;br /&gt;I really liked this quote from an executive at EMC, a leading computer storage equipment firm: “current failover infrastructures are failures waiting to happen.”&lt;br /&gt;&lt;br /&gt;To be competitive in today's always-connected, always-available world, we need to reinvent the fundamental idea of disaster recovery. One of the major benefits of using cloud computing is that you can make these types of failure assumptions well before the failures happen, using an emerging global toolset of cloud components. It's not a matter of if, but a matter of when. When you take into consideration that application components will fail, you can build an application that features "failure as a service". 
One that is always available, one with Zero Nines.&lt;br /&gt;&lt;br /&gt;Reuven Cohen, founder &amp;amp; chief technologist for Toronto-based &lt;a href="http://www.enomaly.com/"&gt;Enomaly&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2171310898568852604-8780728266440749716?l=zeronines.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://zeronines.blogspot.com/feeds/8780728266440749716/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://zeronines.blogspot.com/2008/09/cloud-failure-myth-of-nines.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/8780728266440749716'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2171310898568852604/posts/default/8780728266440749716'/><link rel='alternate' type='text/html' href='http://zeronines.blogspot.com/2008/09/cloud-failure-myth-of-nines.html' title='Cloud Failure: The Myth of Nines'/><author><name>ZeroNines® Technology, Inc.</name><uri>http://www.blogger.com/profile/11020395597947872239</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='19' height='32' src='http://3.bp.blogspot.com/_Cv9Idhtm2i0/SSm3v-a0y6I/AAAAAAAAABU/2kqlaZN9Xuc/S220/Z9_color_logo.jpg'/></author><thr:total>0</thr:total></entry></feed>
