Data Loss and Hard Drive Failure: Understanding the Causes and Costs


Commissioned by:
DeepSpar Data Recovery Systems
http://www.deepspar.com

Written by:
David M. Smith
Associate Professor of Economics
Graziadio School of Business
Pepperdine University
david.smith@pepperdine.edu

Michael L. Williams
Assistant Professor of Information Systems
Graziadio School of Business
Pepperdine University
michael.williams@pepperdine.edu


Read next:
How to Diagnose and Recover the Most Common Cases

Abstract


Hard drive failure is an inescapable reality in the modern business world. Whether due to human error, software corruption, or other causes, most firms will face incidents of lost data through hard drive failure. In this paper we analyze the various causes of hard drive failure and estimate the costs of each incident. Our calculations indicate that, on average, a single data loss incident will cost an organization $2,900, the majority of which is measured as lost productivity. Finally, we offer seven suggestions for responding to hard drive failures.

Results of Data Loss


Data loss and computer downtime have serious implications for business. Lost data can lead to costly downtime for sales and marketing and reduced customer service while customer databases are restored or rebuilt. Lost financial data can lead to lost contracts and stock value, or worse. A recent study by Datamonitor found that as many as one-third of IT decision-makers believe that a major data loss incident at their firm could lead to bankruptcy (Datamonitor, 2007). Small businesses may be more vulnerable, according to a recent survey by Verio. Seventy percent of small businesses reported that a single incident of data loss would be considered significant and costly. These concerns are well grounded, as over one-half of the respondents have already experienced some data loss (Verio, 2007). Although hard drive manufacturers claim less than a 1% failure rate, recent research by computer scientists at Carnegie Mellon University found that a 2%-4% failure rate is more common and that under some conditions the failure rate may reach as high as 13% (Schroeder & Gibson, 2007).
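To give a sense of what these failure rates mean in practice, the following minimal sketch estimates how many drive failures a firm might expect in a year. It assumes the quoted rates are annual and that drives fail independently of one another; the fleet size of 100 drives and the function names are hypothetical, chosen only for illustration.

```python
# Minimal sketch: what a given drive failure rate implies for a fleet.
# Assumptions (not from the paper): rates are annual, failures are
# independent, and the fleet size below is purely illustrative.

def expected_failures(drive_count, failure_rate):
    """Expected number of failed drives in one year."""
    return drive_count * failure_rate


def probability_of_any_failure(drive_count, failure_rate):
    """Chance that at least one drive in the fleet fails in one year."""
    return 1 - (1 - failure_rate) ** drive_count


FLEET_SIZE = 100  # hypothetical number of drives in the firm

# Rates quoted in the text: manufacturer claim, observed range, worst case.
for rate in (0.01, 0.02, 0.04, 0.13):
    failures = expected_failures(FLEET_SIZE, rate)
    any_failure = probability_of_any_failure(FLEET_SIZE, rate)
    print(f"rate {rate:.0%}: about {failures:.0f} failures expected, "
          f"{any_failure:.0%} chance of at least one")
```

Under these assumptions, even the lower end of the observed range makes at least one failure per year likely for a fleet of this size.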

The cost of lost data varies depending upon its application, as well as the potential value that can be captured from use of the data. In addition, there is a cost associated with recovering the data, as well as lost productivity due to computer downtime. Using available data and existing research, this paper attempts to quantify the costs associated with episodes of data loss, considering the costs of recovery, as well as lost sales and reduced productivity.


Growth of Stored Data


The amount of data stored within corporate servers, workstations, and users' machines continues to grow. The dramatic growth of stored data is fueled by three trends: the increasing capacity of storage media, decreasing costs per MB, and the emerging role of information technology within firms.

In 1965 Gordon Moore, a co-founder of Intel, estimated that the pace of technology would support a doubling of the number of transistors on a single piece of silicon every 18-24 months for about the same cost. This estimate has largely held true (see Figure 1).

Figure 1: Moore's Law from 1970 - 2005 (Intel)

While Moore's prediction dealt with the number of transistors on a piece of silicon, it has implications for the growth and costs of data storage as well. The reality of "Moore's Law" implies similar outcomes in the size of magnetic disk storage (see Figure 2).


Figure 2: Hard drive capacity growth from 1995 - 2006 (IBM)

A second trend affecting the growth of data storage is that the cost per megabyte has plummeted over the past twenty years (see Figure 3). This trend is also related to Moore’s law.

Figure 3: Costs per Megabyte of digital storage continue to fall (source: PC Magazine, Oct 2, 2007)

A third trend in the rapid growth of stored data is the emerging role of information technology within firms. In addition to the historical responsibility for storing transactions and financial information, many firms are moving their information resources out of the back office and closer to the front line with Customer Relationship Management (CRM) and Supply Chain Management (SCM) systems. While these systems create advantages for firms, they are data intensive and dramatically increase a firm's storage requirements. Some industries--for instance hospitals and financial services--are quickly moving towards paperless environments where all corporate information is stored and transmitted electronically (Golinkin, 2007). The trend from back-office automation towards front-office, value-added, and ubiquitous computing requires extensive data storage.

These trends cause many firms to invest in additional storage. The research firm IDC reports that the total disk storage capacity sold in the first quarter of 2006 topped 1,030 petabytes, up more than 48 percent over the year before (one petabyte is equal to one quadrillion bytes of data, or 1,000,000 GB). According to their research the total disk storage market grew by over 6% in 2006 to $24.4 billion. Over the same period, the mid-priced products in this market experienced double-digit growth through sales to small and medium sized businesses (Regan, 2007). According to Gartner, an IT research and consulting firm, many firms are seeing the costs of their data backup and archiving far outpace growth in their overall budgets (Buchanon, 2007). These trends will likely continue or increase into the foreseeable future.


PCs in Use


As noted above, companies are relying more and more on data in a distributed environment. This trend is likely to continue, as evidenced by a 2006 survey of CIOs by Gartner that revealed “increased mobility” as a high priority item for chief information officers. There are over 230 million PCs in use in the United States (Computer Industry Almanac, 2006). In the US, over 50 percent of the workforce, or 77 million individuals, use a computer at work (Bureau of Labor Statistics, 2005). Increasingly, these PCs are likely to be laptops, with research firm IDC estimating that in 2011 laptops will outsell desktops (BBC News, 2007).


Episodes of Data Loss


Survey data from companies that specialize in data recovery may be used to investigate the primary causes of data loss (Harris, 2007). Hard drive failure is the most common cause, accounting for 38 percent of data loss incidents. Drive read instability, which includes occasions where media corruption or degradation prevents access to the data on a disk, accounts for 30 percent of incidents. Software corruption, which might include damage caused by system software or another program (e.g., a virus attack), accounts for 13 percent of data loss incidents. Human error accounts for 12 percent of data loss episodes. This includes the accidental deletion of data as well as incorrectly partitioning the hard drive. The relative magnitudes of the different types of data loss are illustrated in Figure 4. (This analysis ignores data loss due to theft, an increasing problem given the growth in use of laptops.)

Figure 4: Causes of Data Loss (source: a survey of 50 data recovery firms across 14 countries, DeepSpar Data Recovery Systems)

The Cost of a Data Loss Incident


An episode of data loss will result in one of two outcomes: either the data is recoverable or it is permanently lost. In today’s environment, with numerous backup and recovery solutions, businesses need not suffer episodes of irretrievable data, except in cases of careless planning or major disaster. We will assume for the purposes of this paper that all data may be retrieved. However, as demonstrated in prior work (Smith, 2003), the inherent value of data can be significant, and as noted earlier, if data is permanently lost it could bankrupt many organizations. Setting this possibility aside, this study will instead focus on the costs of retrieving data, as well as lost productivity and sales during an episode of computer downtime.

In approximately 40% of cases--when there has been no physical damage to the hard drive--data may be retrieved by an in-house technical support person. These cases are often caused by human error, software corruption, or computer viruses. We offer advice for restoring data in these cases below.

In the case of hard drive failures and instability episodes, which make up approximately 70% of severe data loss incidents, internal recovery efforts are not advised, and outside expertise should be sought. The cost of data recovery can vary widely, depending on numerous factors, including the size and type of storage media, severity of damage, parts required, and urgency. The authors conducted an independent survey by polling eight separate data recovery companies on the estimated cost of recovering a 160GB desktop hard drive. Price estimates varied from $300 to $3,900, depending on the vendor and the factors noted above. The highest prices were reserved for highly time-sensitive (1-2 day) data recoveries. For standard recoveries, the majority of estimates fell within the range of $500 to $2,500. Taking the midpoint of this range as a reasonable approximation, the average cost of recovering data on a 160GB hard drive would be $1,500.

In cases where the data may be retrieved with in-house expertise, we must consider the internal resources that are encumbered. If there is a computer support specialist employed within the company, both the number of hours needed to recover the data and the cost of employing this individual must be taken into account. The Bureau of Labor Statistics reports that the average computer support specialist currently earns $44.60 an hour, including salary and benefits. Assuming that the average time needed to recover lost data is approximately 8 hours, the cost of using an in-house employee to recover lost data is approximately $350.

Taking into account the expected probability of whether the data can be retrieved in-house or would need to be sent to data recovery specialists, the expected average cost to retrieve data is calculated as $1,150. Figure 5 summarizes the cost incurred in order to recover the lost data.


Figure 5: Cost of Recovery from a data loss episode
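The expected recovery cost can be reproduced with a short calculation. The sketch below is illustrative only: the vendor price range, hourly wage, and 8-hour estimate are taken from the text, while the 70/30 split between outside and in-house recoveries is an assumption, based on the 70% share of hard drive failures and instability episodes and chosen because it comes close to the reported figure. The function and constant names are hypothetical.

```python
# Minimal sketch of the expected recovery cost per data loss incident.
# The 70/30 split is an assumption; the other inputs are quoted in the text.

OUTSIDE_LOW, OUTSIDE_HIGH = 500, 2500  # typical vendor estimates, USD
SUPPORT_HOURLY_COST = 44.60            # support specialist, USD per hour
IN_HOUSE_HOURS = 8                     # assumed in-house recovery time
OUTSIDE_SHARE = 0.70                   # assumed share sent to a vendor


def expected_recovery_cost(outside_share=OUTSIDE_SHARE):
    """Probability-weighted cost of recovering the data, in USD."""
    outside_cost = (OUTSIDE_LOW + OUTSIDE_HIGH) / 2        # midpoint: 1500
    in_house_cost = SUPPORT_HOURLY_COST * IN_HOUSE_HOURS   # about 357
    return (outside_share * outside_cost
            + (1 - outside_share) * in_house_cost)


print(f"Expected recovery cost: ${expected_recovery_cost():,.0f}")
```

With these inputs the result is within a few dollars of the $1,150 reported above; the remaining difference comes from rounding the in-house cost to $350.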

In addition to the cost of recovering the data, users and companies are subjected to lost productivity. Every computer user has experienced a time of frustration, and corresponding lost productivity, when their computer is unavailable for use. When data loss occurs, these episodes can become protracted and quite costly. In order to estimate the costs, the following factors must be considered: the individual user’s productivity, the length of the downtime, and the extent to which an individual’s data loss episode affects others in the organization.

While the attempt to recover data is underway, an individual is unable to access his or her PC, thereby reducing productivity, which in turn impacts company sales and profitability. This opportunity cost, lost productivity due to computer downtime, impacts a company’s income statement just as other, more common and explicit costs do. By what mechanism does this impact the bottom line? Some employees are directly involved in sales and revenue production; others are involved in more supportive or indirect roles. Economic theory says that each employee’s productivity, or contribution to firm revenue, can be approximated using the individual’s compensation. Available data sources suggest that individuals who use computers at work earn an average of $46.48 an hour in wages and benefits. The time needed to recover data may vary greatly, from one hour to several days. In addition, most workers won’t have their productivity reduced to zero, as they could perform other tasks that do not require their computer. We will assume a productivity slowdown of 50 percent.

Costs begin to mount when considering the “contamination effect”: when one individual’s computer downtime affects others within an organization. The IT Department may have to be involved, and in work environments that are collaborative, productivity slowdowns may impact many others within the organization. The slowdowns will depend on the level and nature of collaboration. In a related scenario, when a computer network is down, others have estimated that costs may run into millions of dollars for each hour of downtime (Patterson, 2002).

Precision in estimating the contamination effect will depend on the factors noted above, but a conservative estimate for a typical data loss episode might suggest that an individual’s inability to access key data would affect three co-workers, reducing the productivity of each by 25 percent.

The total loss due to productivity slowdown depends critically on the length of downtime, which will be determined to a significant degree by whether the computer needs to be sent to an outside firm or whether the data can be retrieved in-house. For outside recoveries, the authors’ survey of data recovery firms suggests that a 5-day turnaround would likely serve as the minimum amount of time needed for a standard hard drive recovery, including time needed for transport. For in-house recoveries, 8 hours would appear to be a reasonable estimate of the time needed for recovery.

Taking into account these various factors (whether the data is recovered onsite, the length of the recovery period, and the expected “contamination effect”), the average estimated productivity loss due to an episode of data loss is $1,750. Figure 6 summarizes this expected productivity loss.


Figure 6: Lost productivity due to an episode of data loss
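The productivity estimate can be approximated with a similar calculation. The sketch below uses the hourly compensation, slowdown percentages, and turnaround times stated in the text. It additionally assumes 8-hour workdays, that affected co-workers earn the same average compensation as the user, and the same 70/30 split between outside and in-house recoveries; these three assumptions and all names in the code are illustrative, not taken from the paper.

```python
# Minimal sketch of the expected productivity loss per incident.
# Assumptions (not stated in the paper): 8-hour workdays, co-workers paid
# the same average compensation, and a 70/30 outside/in-house split.

USER_HOURLY_COST = 46.48     # wages and benefits, USD per hour
USER_SLOWDOWN = 0.50         # affected user's productivity reduction
COWORKERS = 3                # co-workers affected ("contamination effect")
COWORKER_SLOWDOWN = 0.25     # each co-worker's productivity reduction
OUTSIDE_HOURS = 5 * 8        # 5-day turnaround at assumed 8-hour days
IN_HOUSE_HOURS = 8           # in-house recovery time
OUTSIDE_SHARE = 0.70         # assumed share of outside recoveries


def productivity_loss(hours):
    """Cost of reduced output over a downtime of the given length."""
    # Equivalent number of fully idle workers: 0.5 + 3 * 0.25 = 1.25
    idle_equivalent = USER_SLOWDOWN + COWORKERS * COWORKER_SLOWDOWN
    return USER_HOURLY_COST * idle_equivalent * hours


def expected_productivity_loss(outside_share=OUTSIDE_SHARE):
    """Probability-weighted productivity loss, in USD."""
    return (outside_share * productivity_loss(OUTSIDE_HOURS)
            + (1 - outside_share) * productivity_loss(IN_HOUSE_HOURS))


print(f"Expected productivity loss: ${expected_productivity_loss():,.0f}")
```

Under these assumptions the result lands within about one percent of the $1,750 reported above. Because the outside-recovery term dominates, the estimate is most sensitive to the assumed turnaround time.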

Adding the expected cost of data recovery ($1,150) to the expected loss of productivity ($1,750), we calculate an average cost of a data loss episode of $2,900. Once again, this assumes that the data is retrievable. If data is lost on a permanent basis, this estimate would grow significantly, as shown in Smith (2003). Note that productivity losses dominate the costs of an episode of lost data.
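The relative weight of the two components can be checked directly from the reported figures. The short sketch below simply combines the $1,150 and $1,750 estimates; the function name is hypothetical.

```python
# Minimal sketch: total cost per incident and the share due to lost
# productivity, using the two rounded estimates reported in the text.

RECOVERY_COST = 1150       # expected cost of data recovery, USD
PRODUCTIVITY_LOSS = 1750   # expected productivity loss, USD


def cost_breakdown(recovery, productivity):
    """Return the total cost and the fraction attributable to productivity."""
    total = recovery + productivity
    return total, productivity / total


total, productivity_share = cost_breakdown(RECOVERY_COST, PRODUCTIVITY_LOSS)
print(f"Total: ${total:,}; productivity share: {productivity_share:.0%}")
```

On these figures, lost productivity accounts for roughly 60 percent of the cost of an incident.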


Understanding Hard Drive Failure


Understanding hard drive failure is important because it is the largest single explanation for data loss. Hard drive failure may be related to mechanical, electronic, or firmware failures. Mechanical failures occur when physical components of the device itself begin to wear or malfunction. Electronic failures occur when the printed circuit board (PCB) begins to produce errors. Finally, many hard drive failures are related to out-of-date, corrupt, or buggy firmware. Firmware is the controlling software built into the hardware device itself; in a hard drive, it is stored on the disk platters. Like most software in use today, firmware may become damaged or corrupt over time. This is a very common failure for modern drives because of the complexity of firmware design.


Is "Lost Data" Really Lost?


Most instances of hard drive failure do not destroy all of the data on the disk, and much of the data on failed drives is often recoverable. Both consumer applications and professional data recovery tools and services are available to recover lost data. Which alternative to choose depends upon the value of the lost data. The more valuable the data on the failed drive, the fewer non-professional recovery attempts should be made. Non-professional tools and system software (e.g., chkdsk) often fix errors by overwriting the file system on the drive. Though this may repair the file system, it can permanently destroy the data. Disks with highly valuable data should be sent to a professional data recovery service. A recent survey of 50 data recovery firms across 14 countries found that 15% of all non-recoverable data loss situations were created by prior non-professional data recovery attempts (DeepSpar Data Recovery Systems, 2007).

The most thorough professional data recovery services are able to retrieve data from drives with mechanical, electronic, and firmware failures. Comprehensive professional efforts include drive restoration, disk imaging, and data retrieval (DeepSpar, 3D Data Recovery Process). First, during the drive restoration phase, any existing damage on the hard drive is repaired. This includes mechanical problems such as failed heads, electronic problems such as failed PCBs, and firmware issues. These repairs are made by replacing individual drive components with donor parts and fixing firmware. The second phase is disk imaging, in which the contents of the drive are retrieved (for example, by reading bad sectors or handling other read instability issues) and copied to a healthy drive to reduce the probability of further data loss on the original drive. Finally, the data is retrieved from the new healthy drive. During this phase the drive’s file system is restored, and all files are verified for integrity and repaired where necessary. It should be noted that many professional data recovery services focus almost exclusively on data retrieval. However, without adequate attention to drive restoration and disk imaging, any data retrieval effort will likely encounter serious challenges and may lead to further drive degradation and data loss.


What to Do When You Experience Hard Drive Failure


As evidenced above, hard drive failure is the most common source of data loss, which can lead to negative consequences for any business. Unfortunately, hard drive failure is inevitable. It is not a question of if a firm's hard drives will fail, but when. However, with proper planning and a strategic response, hard drive failure does not have to lead to data loss. Below we offer seven recommended strategies for dealing with hard drive failure.

  1. If the failed drive is the system boot disk, immediately unplug the computer and remove the drive. Do not attempt to reboot from this drive. Depending on the nature of the data stored on this drive, one may wish to make an initial attempt at data recovery before sending the drive to a professional data recovery service. However, if the data on the drive is mission critical, we recommend immediately contacting a professional data recovery service.

  2. If the disk does not contain mission-critical information, one may attempt to retrieve the data without the assistance of a professional data recovery service. However, do not execute any system software, such as chkdsk, to repair the file system unless you can afford to lose the data completely. System software is intended to repair a disk's file system, not to recover data. These tools will most likely overwrite lost data.

  3. If the failure is definitely related to the file system (e.g., deleted files, OS failure, or virus attack), and not a physical, electronic, or firmware failure, data retrieval software may be able to recover the disk's data. We recommend first installing the drive into an external USB enclosure for this process to reduce the disk utilization during boot up. Once the drive is recognized by the operating system, immediately begin the data retrieval. Do not save any files to the target disk or install programs on it, as this would likely overwrite lost data. Though this strategy is often successful, there is a chance the OS will overwrite some lost data while updating or writing system files to the drive, thus resulting in data loss. Therefore we do not recommend applying this recovery method to drives containing mission-critical information. If this strategy does not recover your data, you should contact a professional data recovery service for assistance.

  4. Never open a drive case. It may only be opened in a clean-room environment. Any other exposure will eventually result in the physical destruction of the disk's magnetic layer and the complete loss of data.

  5. Never attempt to swap PCBs from a healthy drive to a failed drive. Modern hard drives are manufactured with unique configuration parameters based on the tolerances of the individual components at the disk's manufacture. Applying a PCB/ROM to a disk for which it was not manufactured may destroy the drive and make it non-repairable. This is true even if the two drives share a common manufacturer, model, and manufacturing date.

  6. Do not attempt to "repair" bad sectors or to read data from bad sectors by using data retrieval software on a failing disk. Doing so will either overwrite the underlying data or result in data loss. Note: as described above, professional imaging tools retrieve bad sectors to a healthy disk rather than repairing or skipping bad sectors on a failing disk.

  7. In cases of water, fire, or vandalism damage, do not attempt to power up a system that contains critical data. Doing so may destroy the disk's magnetic layer and cause the data to be non-recoverable.

Conclusion


In this paper we have analyzed the most common causes of data loss, estimated the average cost per incident, and suggested several strategies for responding to a hard drive failure. In summary, every firm will face the problem of hard drive failure. We argue that as data storage costs decrease and the role of information technology in modern firms increases, these problems will become more prevalent. Using the strategies described above, however, a firm should be able to recover data from most hard drive failures through either internal support services or external data restoration services.


References


  1. BBC News, “Laptops set to out sell desktops,” March 21, (2007).
  2. Stewart Buchanon, “Five Ways to Manage Storage Assets and Defuse Explosive Growth,” Gartner Research Report #G00149222 (2007).
  3. Bureau of Labor Statistics, “Computer and Internet Use At Work,” (2005).
  4. Bureau of Labor Statistics, “Employer Costs for Employee Compensation,” June (2007).
  5. DeepSpar Data Recovery Systems, “DeepSpar 3D Data Recovery Process”.
  6. Computer Industry Almanac, “Computers In-Use by Country,” (2006).
  7. Datamonitor, “McAfee Expands Data Loss Prevention Solution,” September 19, (2007).
  8. DeepSpar Data Recovery Systems, “A Survey of Data Recovery Issues & Causes,” Unpublished working paper, (2007).
  9. Gartner Group, “CIO Survey Shows the Continuing Importance of Mobile Applications,” October 30, (2006).
  10. Web Golinkin, “Health Care When You Want It,” Wall Street Journal, August 2, (2007).
  11. Robin Harris, “How Data Gets Lost,” ZD Net Storage Bits, August 6, (2007).
  12. IBM, “Hard drive capacity growth from 1995 – 2006”.
  13. Intel, “Moore's Law”.
  14. PC Magazine, “Tech Tracker: Storage: From Highway Robbery to Runaway Bargain,” p. 21, Oct 2, (2007).
  15. David Patterson, “A Simple way to Estimate the Cost of Downtime,” Proceedings of LISA ’02: Sixteenth Systems Administration Conference, pp. 185-88, (2002).
  16. Keith Regan, “Mounting Data Spurs Corporate Storage Spending,” E-Commerce Times, March 13, (2007).
  17. Bianca Schroeder and Garth A. Gibson, “Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?” Proceedings of the 5th USENIX Conference on File and Storage Technologies, pp. 1-16, (2007).
  18. David Smith, “The Cost of Lost Data,” Graziadio Business Report, Vol. 3, (2003).
  19. Verio, “Verio Survey Reveals Small Business Data Left Vulnerable to Loss,” (2007).


©2004-2013 ACE Data Recovery Engineering Inc.
DeepSpar, DeepSpar Disk Imager, 3D Data Recovery, and all associated designs are trademarks of ACE Data Recovery Engineering Inc. PC-3000 and Data Extractor are products of ACE Laboratory Russia, sold under contract in North America by ACE Data Recovery Engineering Inc. under the DeepSpar brand.