On April 12, 2011, the National Agricultural Cooperative Federation, aka Nonghyup or NH Bank, experienced a system-wide crash that halted all of its banking transactions and prevented 30 million customers from accessing ATMs and banking services for a week. The biggest problem with the system was with credit card transactions. Between April 12th, when the server crashed, and April 18th, customers made 73,500 transactions worth 57.8 billion won ($53 million)... Nonghyup [NH Bank] said roughly 5 percent of the data on credit card transactions was lost after the servers were forced to shut down. It expected a full recovery of the lost data by April 22nd. As of my visit to Seoul last week, this “full recovery” had not occurred.
The Culprit: “The laptop of an IBM worker ordered the deletion of execution files of our key systems, which involved more than one hundred IBM servers. This generated the service failure,” a Nonghyup official said. This laptop also apparently ordered the deletion of data on IBM DR servers, calling into question whether this was human error or a planned attack.
Regardless of the source or reason for the lost data, we cannot ignore the simple fact that any BC/DR plan must include the secure and safe storage of backups or snapshots in an offsite location. This is what we call a BACKUP.
What I hear far too often is that a fault-tolerant system, whether based on RAID or on a server mirroring strategy, is considered to be a Disaster Recovery solution. This could not be further from the truth. A mirror faithfully replicates deletions and corruption along with good data, so it cannot take you back to a point before the damage was done.
It is wise to have a failover/failback strategy for Tier-1 Mission Critical data. We support this approach in theory and in our solutions. It is not, however, a replacement for regular backups, which should be present both on-site, for rapid recovery, and off-site, for recovery from more significant disasters.
If you are concerned about the quality and security of your BC/DR plan, ask the following questions:
- Can we recover quickly, in less than an hour, from a local server failure or data loss?
- Can we rapidly resume IT operations in the event of a primary data center loss or failure?
- Can we roll back in time to recover data prior to an infection or intrusion?
These simple questions can help to highlight some of the most common shortcomings of BC/DR plans. Think about how you would feel if your company suffered the same fate as NH Bank and then think about how you can avoid it and be the hero instead of the villain.
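The three questions above can be partly checked from backup metadata alone. The following is a minimal sketch in Python, not a description of any particular product: the `Backup` record, the `review_backups` function, and the 24-hour freshness threshold are all assumptions made up for illustration. It only confirms that the right copies exist. Whether you can actually restore in under an hour is something only a real restore drill will tell you.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Backup:
    """One backup or snapshot (hypothetical record, not a vendor format)."""
    taken_at: datetime
    offsite: bool  # True if stored away from the primary data center


def review_backups(backups, now, infected_at, max_age=timedelta(hours=24)):
    """Check the preconditions behind the three questions.

    max_age is an assumed freshness threshold; set it from your own
    recovery point objective.
    """
    onsite = [b for b in backups if not b.offsite]
    offsite = [b for b in backups if b.offsite]
    return {
        # A recent on-site copy is needed for rapid local recovery.
        "recent_onsite_copy": any(now - b.taken_at <= max_age for b in onsite),
        # An off-site copy is needed to survive loss of the primary site.
        "offsite_copy": len(offsite) > 0,
        # A copy taken before the infection allows a rollback in time.
        "clean_restore_point": any(b.taken_at < infected_at for b in backups),
    }


if __name__ == "__main__":
    # Made-up dates for illustration only.
    now = datetime(2024, 1, 18, 12, 0)
    backups = [
        Backup(taken_at=datetime(2024, 1, 18, 2, 0), offsite=False),
        Backup(taken_at=datetime(2024, 1, 10, 2, 0), offsite=True),
    ]
    report = review_backups(backups, now, infected_at=datetime(2024, 1, 12))
    for check, passed in report.items():
        print(f"{check}: {'ok' if passed else 'MISSING'}")
```

Any check that comes back as missing points at the gap to close first. A mirror-only setup with no independent backups fails all three checks.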
If we can help you work through these challenges for your company, please do not hesitate to contact us.
Sources: