I had a heart attack back in 2008. I was lucky. My local emergency room facility and the intensive care unit hospital that I was transferred to happened to share my medical records in electronic format. But only about 10% of U.S. hospitals use electronic records so if this had happened away from home I probably would have died because no other doctor or hospital would have known about my pre-existing medical conditions.
It was suddenly very easy for me to see the need for a system that would allow consumers to take their medical records with them wherever they go. Not only for emergencies but for everyday reference. Some quick Googling revealed Personal Health Record (PHR) solutions from Microsoft (HealthVault), Google (Google Health) and a large number of others, but consumer adoption was low. I also discovered that the Electronic Medical Records (EMRs) used by hospitals and doctors were no solution because they are inaccessible to consumers and practitioners outside the system.
I enlisted the help of my personal doctors, friends and classmates who work in the healthcare field as well as other technologists who are consulting to large medical organizations around the country. All told, we have consulted with 36 experts who freely gave us their opinions about the issues surrounding EMRs and how a comprehensive PHR should be designed in order to deliver high value to consumers while potentially saving lives. I summarize the issues in BOLD and describe how we address them.
So today we at ZeroNines introduced ZenVault Medical (www.zenvault.com/medical), a Cloud-based, private, encrypted, online PHR for consumers that you can access through a computer or mobile device. In addition to helping people with their medical care, it’s a great example of how the Cloud and other cutting-edge technologies can come together to create a unique and valuable consumer product.
Background: The Need for Digital Medical Records
If you’re like most people, your medical records are scattered among a number of doctors and they are hard to get to. The Obama administration wants the country to convert to Electronic Medical Records. The goal is to improve healthcare and cut costs by making an individual’s collection of medical records available electronically at any hospital or doctor’s office, cutting down on paper volume, saving time, and increasing accessibility particularly in emergencies. This truly needs to happen – my own experience proves that – but the issue is how.
The Problem: Security, Privacy, and Reliability
Questions surrounding security and privacy make many citizens and consumer advocates reluctant to jump on board. Will such a system be run by the government or by business? Who will have access? Will sensitive personal information about illnesses, prescriptions, and treatments be turned over to insurance companies? To marketers? To employers? Can any body of law successfully regulate how such highly personal information is handled and protected, enabling it to benefit the individual yet keeping it out of the hands of those who would profit by violating privacy? Is it even the government’s place to get involved with personal medical records? And what technology is secure enough to handle all this?
Security: Any medical records system needs to keep hackers at bay. Well-publicized data breaches at Microsoft and Google call into question their ability to protect medical privacy. Frankly, I decided to subscribe to one of these systems before we came up with ZenVault, but was concerned about who might be accessing my records and selling them to insurance companies and marketing firms.
Privacy: Many companies offering free digital medical records turn around and sell customer data to pharmaceutical and insurance companies. And a September 16, 2010 article in the Wall Street Journal described a data breach wherein a Google engineer broke the company’s privacy policies by accessing private customer information.
Reliability: If anything needs 100% uptime, it’s medical applications. Take a look at some of the high-profile downtime events discussed in the rest of this blog and then imagine the cost in lives and well-being if they had affected hospital emergency rooms.
The Solution: Customer Control of a Safe, Secure, and Always Available™ Personal Health Record
Simply putting control of the health record in the hands of the individual consumer or patient addresses the bulk of these concerns. If no one can read the record but the customer, that’s most of the battle won. So what is the difference between ZenVault Medical and other consumer-facing PHRs like Google Health and HealthVault?
Security: ZenVault encrypts stored records with a patent-pending variant of the NSA-approved encryption protocols that protect top-secret information. ZenVault does not employ a “key ring” that stores customer encryption keys, which means there is no copy of a key for anyone to find and use to rummage through your data. The customer creates his or her own unique encryption key so only they can access and edit their private medical records. SSL-secured sessions protect data in transit from computers, smartphones, and tablets.
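For readers who want to see what "no key ring" can look like in practice, here is a minimal sketch. It is not ZenVault's actual code, and the post does not describe the real mechanism; it only illustrates the general idea of deriving a key from a customer's passphrase on demand so that the key itself is never stored. The function name, passphrase, and iteration count are assumptions chosen for illustration.

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key from a passphrase using PBKDF2-HMAC-SHA256.

    Hypothetical helper: the real ZenVault scheme is not public.
    """
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, iterations)


# In this sketch only the random salt would be stored alongside the record.
# The key is recomputed from the passphrase each time and never saved.
salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)

# The same passphrase and salt always give the same key back.
same_key = derive_key("correct horse battery staple", salt)
# A different passphrase gives a different key, so a wrong guess decrypts nothing.
other_key = derive_key("wrong passphrase", salt)
```

The design point is simply that a service holding only the salt and the encrypted record has nothing to hand over that would unlock the data.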
Privacy: ZenVault never shares information. Period. We don’t sell it, rent it, or give it away, not even in a “sanitized” format like some admit to doing. We charge consumers for our service and our business model is based on customer trust. If they don’t trust us we lose. In fact, our encryption system prevents even our own engineers and administrators from reading patient data, so we couldn’t sell it even if we wanted to. How’s that for a guarantee?
Reliability: ZenVault uses ZeroNines' Always Available™ technology designed to protect the world's most sensitive financial and military computer systems. There is virtually no "downtime" or data loss with ZenVault. A Cloud-based infrastructure helps keep costs down, ensures scalability, and supports universal accessibility. Use of Always Available allays any concerns over Cloud reliability. In fact, we intend to use ZenVault as an example of a highly reliable, high-usage application deployed in the Cloud. Read more about Always Available on the ZeroNines.com website.
Convenience: Users can update or read their records anywhere they have Internet access. They can send their records to any doctor with just a few clicks using a secure message system. Have you ever wasted time at a doctor appointment filling out a clipboard full of medical history forms? Use ZenVault to send them your PHR instead! Doctors can send patients their records, lab results, and x-rays with equal ease.
Affordable: A free account is available, offering a basic PHR with full security, encryption, and privacy protection. A premium account adds advanced features for a small monthly charge.
Secure Emergency Room Access: ZenVault offers emergency rooms their own accounts with their own special encryption keys. They get controlled access to six key fields in a patient’s record such as history of heart disease, drug sensitivities, and emergency contact information. This gives them the basic information they need to save a life and contact loved ones yet protects the majority of personal information until the patient or their family elects to release it.
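The idea of controlled access to a handful of fields can be sketched in a few lines. This is an illustration only, not how ZenVault implements it. The field names are hypothetical, and only the three examples named above are listed because the post does not say what the other three fields are.

```python
# Hypothetical field names; the post names only these three of the six fields.
EMERGENCY_FIELDS = {"heart_disease_history", "drug_sensitivities", "emergency_contact"}


def emergency_view(record: dict) -> dict:
    """Return only the fields an emergency room account is allowed to see."""
    return {name: value for name, value in record.items() if name in EMERGENCY_FIELDS}


# Made-up sample record for illustration.
record = {
    "heart_disease_history": "heart attack, 2008",
    "drug_sensitivities": "penicillin",
    "emergency_contact": "spouse, 555-0100",
    "full_prescription_list": "(kept private)",
    "insurance_details": "(kept private)",
}

er_record = emergency_view(record)
```

Everything outside the allowed set stays hidden until the patient or family chooses to release it.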
Take Your Personal Health Record with You
If you have Internet access, you can use ZenVault. I hope none of you ever has a medical emergency like the one that sent me to the hospital two years ago. But if you do, ZenVault could save your life by putting the needed information in the right place, at the right time. I have no doubt that one day a universal health record database will be a reality, but until then you can have all the benefits while keeping control yourself. Try it out and let me know what you think: www.zenvault.com/medical.
Visit the ZeroNines.com website to find out more about how our disaster-proof architecture can protect businesses of any description from downtime.
Alan Gin – Founder & CEO, ZeroNines
October 11, 2010
May 24, 2010
TD AMERITRADE Outage and How Failover Fails Finance
Online brokerage TD AMERITRADE was offline for 80 minutes on Thursday May 20, 2010 [source]. Because of the outage, some of their clients could not log in to their accounts to place trades during the powerful market downdraft that occurred that day [source]. Outages among financial firms have gotten a lot of coverage in the last couple years, no doubt because of the universally amped-up sensitivity to any kind of news with the word “financial” attached to it. Here’s a brief look at this outage, and some commentary on outages in general among financial companies.
Background: About TD AMERITRADE
Online discount broker TD AMERITRADE has millions of U.S. customers (Wikipedia reports over six million), and many more internationally. The company has grown rapidly through acquisition and was the 746th-largest US firm in 2008 [source]. It acquired thinkorswim Group, Inc., another popular online brokerage, in January 2009. Lots of average Americans use TD AMERITRADE to generate income and manage retirement accounts. I use them myself and really like their system, but did not notice the outage because I was doing other things at the time.
The Problem: An Outage of Some Kind
At about 11:40 AM Eastern time, clients found that they could not log on to the TD AMERITRADE retail website. The outage ended at about 1:00 PM. No disruption was reported on their mobile site or at their subsidiary thinkorswim [source]. Clients already logged in experienced no trouble, prompting one writer to speculate that it was a web authorization issue of some kind [source]. If TD AMERITRADE has made a formal announcement of the cause, a half hour of Googling on my part failed to find it.
Was This a Failed Failover?
Posted on the TD AMERITRADE site [source] is the “TD AMERITRADE Business Continuity Plan Statement” [source]. One of the statements in this brief public document reads “Disruption of service at any of our service centers will result in calls, orders and electronic communications being re-routed to an alternative service center located in a different region of the country with a separate power grid and transportation system.”
Let me state clearly that I am entering the realm of speculation here. The statement quoted above implies that TD AMERITRADE is relying on a business continuity plan based on failover architecture. Failover or cutover has been the de-facto choice for business continuity and until recently it has been the only real game in town. But it is by nature unreliable and even the best systems are subject to downtime. If their backup plan is indeed based on failover, then failover obviously failed them.
The Cost: As Always, it’s the Intangibles
As in so many outages of this kind, the real costs are difficult to estimate. Easiest to ponder are the lost commissions from trades that could not occur during an extremely busy trading day. Less tangible are the effects on reputation and customer satisfaction. No one wants a broker that is unavailable when they need them most. One customer claimed to have lost about $2,000 from being unable to log in [source]. TD AMERITRADE stock fell about 3.7% that day but this may not mean much because markets overall were down about 3%.
According to a May 2007 article from Financial Services Technology, a study from the Meta Group revealed that “the cost per hour for downtime – ranging from simple network outages to major emergencies – in the financial services sector is, on average, $1.4 million” [source].
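To put that average in perspective, here is a back-of-the-envelope calculation that applies it to an 80-minute outage. This is purely illustrative arithmetic. It is not an estimate of what TD AMERITRADE actually lost, since the Meta Group figure is an industry-wide average and the function name is my own.

```python
def downtime_cost(minutes: float, cost_per_hour: float = 1_400_000) -> float:
    """Scale an hourly downtime cost to an outage of the given length in minutes."""
    return cost_per_hour * minutes / 60


# 80 minutes at the quoted sector average of $1.4 million per hour.
estimate = downtime_cost(80)
print(f"${estimate:,.0f}")  # prints $1,866,667
```

At the sector average, an outage of that length would come to roughly $1.87 million.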
An Ugly Thought: Downtime among High Frequency Traders
For many, the cost will be far higher. Some banks, hedge funds, and other high-power financial firms engaged in High Frequency Trading (HFT) make billions of trades a day over ultra-high speed connections [source]. Many trades live for only a few seconds. Enormous transactions are conceived and executed in half a second, with computers evaluating the latest news and acting on it well before human traders even know what the news is. HFT is having a significant effect on markets; there is evidence that the history-making “Flash Crash” of May 6 2010 was caused and then largely corrected by High Frequency Trading [source]. What would happen if one of these HFT systems was down for an hour and a half? Or even just a minute? Whatever your stance on the ethics of HFT, I think it fair to say that those engaged in it need to avoid downtime at all costs.
Failover Can’t Handle It
Even a successful failover event may cause some glitches and lost trades among the average retail trading populace. But if a High Frequency Trading system experiences such a glitch, billions of dollars could be lost in the blink of an eye. The trades themselves may fail, and by the time the system comes back up the conditions that made those trades possible are a thing of the past. And that’s for a successful failover. A failed failover can leave businesses out of the race for minutes, hours, and even days.
The Alternative: Active/Active Architecture
High profile financial systems clearly need something better than failover. The typical outage is caused by failures of server hardware, server software, upgrades, maintenance, and sometimes more dramatic stuff like fires and floods. The best protection in these cases is to eliminate failover entirely, and switch to an “active/active” or “hot/hot” architecture that eliminates the chance of a failed cutover and the resultant downtime. Always Available™ business continuity architecture from ZeroNines is one such system. Always Available processes all network transactions continually, simultaneously, and equally in multiple locations on multiple servers, all of which are hot and all of which are active. Always Available can offer virtually 100% uptime, because instead of relying on failover Always Available simply continues running the same apps and data at two or three additional locations, with no interruption to the user. So if a web server or database goes down somewhere, the other nodes of the system continue processing without missing a beat. Visit the ZeroNines website to find out more.
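The difference from failover is easy to see in a toy model. The sketch below is not ZeroNines code; it is a simplified simulation of the active/active idea, in which every transaction goes to every node and the request succeeds as long as any one node handles it. The node names and the `Node` and `submit` helpers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy processing node that records the transactions it handles."""
    name: str
    up: bool = True
    log: list = field(default_factory=list)

    def process(self, txn: str) -> bool:
        if not self.up:
            return False  # a failed node simply does not answer
        self.log.append(txn)
        return True


def submit(nodes: list, txn: str) -> bool:
    """Send the transaction to every node; succeed if at least one processed it."""
    results = [node.process(txn) for node in nodes]
    return any(results)


nodes = [Node("denver"), Node("dallas"), Node("chicago")]
submit(nodes, "txn-1")

nodes[0].up = False          # one site goes dark
ok = submit(nodes, "txn-2")  # no cutover step; the other nodes just keep working
```

There is no moment at which a standby has to take over, so there is no cutover to fail. A real system would also need to bring a recovered node back in sync, which this sketch leaves out.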
Alan Gin – Founder & CEO, ZeroNines
January 22, 2010
Twitter Grows Up and then Falls Down
Do you Twitter? Or Tweet? Or whatever they call it? Gotta admit, I don’t. So I didn’t really pay much attention when I first heard that Twitter had gone down the other day [source]. Life goes on. But what a good thing (I thought) that this had not happened a week ago when Twitter took the spotlight on the world stage as it helped gather money for earthquake relief in Haiti.
That made positive headlines everywhere. But if the outage had occurred during the first critical hours or days of the relief effort, a self-righteous world would instead have sneered at Twitter for having failed, despite the fact that Twitter was never billed as a source of disaster relief. This is a window into an important reality: you’d better plan uptime into your system because you never know when you will be caught in the spotlight.
Background: Twitter Comes of Age
Twitter is an instant messaging system that allows short messages of up to 140 characters to be sent to a subscriber’s contacts, or be made available to the Twitter community at large. Millions of people use Twitter every day. Data reported in Wikipedia [source] shows that over 75% of the messages on Twitter are either “conversational” or “pointless babble.” A small but powerful percentage of messages are for far more serious purposes. Twitter was drafted into service for political campaigning, education, public relations, and emergencies long before the Haiti earthquake. But I see its Haiti relief efforts as the moment it came of age, when Twitter was first used to mobilize money on a mass worldwide scale for a focused, responsible, humanitarian purpose.
The Problem: A Failover Failure
On the morning of Wednesday January 20, 2010, Twitter became virtually inaccessible. According to Information Week, "A sudden failure coupled with problems in switching to a backup system produced a high number of errors for around 90 minutes" [source]. In other words, an unspecified failure in one place forced the system to rely on its “failover” architecture, which in turn failed. This is a classic failover failure.
The Cost: Hard to Quantify but Scary to Contemplate
Since Twitter service is free, there may be no direct cost to Twitter. Indirectly, this event contributes negatively to Twitter’s overall equation for obtaining venture capital, building a positive public image, and eventually making money off of paid services.
And here’s where we get into very uncertain territory. What would the cost have been if it had happened just a few days earlier? Would millions of dollars in aid to Haiti have been delayed or failed to materialize? Would people who were saved by this aid have died? Possibly. At the very least, Twitter would have experienced a PR storm far more serious than the January 20 outage caused.
The Solution: Ditch the Failover
Failover (also known as cutover) is the de facto recovery solution for dealing with IT disasters, but it contains inherent flaws that often prevent it from working at the very moment it is needed. Vast numbers of companies and other organizations in the U.S. and around the world rely on failover to keep them functioning in the event of their own disasters, be they failed server equipment or regional catastrophes.
ZeroNines has designed an Always Available™ business continuity architecture that does away with failover entirely. No backup systems ever need to kick in with only microseconds of notice. Instead, processing of all network transactions occurs continually, simultaneously, and equally in multiple locations. In short, Always Available prevents disasters like Twitter’s 90-minute outage on Wednesday. Instead of relying on a cutover event to succeed, Always Available simply continues running the same apps and data at one of two or three additional locations, with no interruption to the user.
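A rough calculation shows why running in several locations at once helps. The sketch below assumes each location is up 99% of the time and that locations fail independently of one another. Both are simplifying assumptions for illustration, not measured figures for any real system.

```python
def combined_availability(per_node: float, nodes: int) -> float:
    """Chance that at least one of several independent nodes is up."""
    return 1 - (1 - per_node) ** nodes


# Assumed 99% availability per location, purely for illustration.
one_site = combined_availability(0.99, 1)
three_sites = combined_availability(0.99, 3)
```

Under those assumptions a single site is unavailable about 1% of the time, while all three being down at once happens about one time in a million. Real-world failures are not perfectly independent, so treat this as an upper bound on the benefit.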
A Parting Thought
Systems like Twitter’s, which rely on failover in the event of disasters, currently form the backbone of business and government information systems. Suppose that earthquake had happened somewhere in the U.S. (which one day it will) and knocked out data centers, communications, and other key infrastructure? If the failover systems fail like they failed Twitter (which they will), then what is the prospect for marshalling aid within our own borders? A scary thing to consider.
Visit the ZeroNines website to find out more about how our disaster-proof architecture can protect businesses (and government agencies) of any description from downtime.
Alan Gin – Founder & CEO, ZeroNines