Michael Pecht and Edmond Elburn
Center for Advanced Life Cycle Engineering, University of Maryland, College Park, MD 20742, USA
The reliability of hard disk drives (HDDs) depends on the drive construction and on the operational and environmental conditions in which the drive is used. Self-monitoring, analysis and reporting technology (SMART) continuously provides attribute information on HDD usage and degradation characteristics. This paper analyzes the failures reported in the Backblaze data set for ST3000DM001 HDDs, which are intended for desktop applications but were used in a data center application. SMART attributes used for predicting failure are discussed and analyzed over the life of many hard drives. A case study on the actual use of SMART is also presented, together with the limitations of the SMART attribute information, of the data center's information, and of the use of desktop drives in a commercial application. The analysis showed that when Backblaze started to record the data, the hard disk drives had already been in operation, with a power-on hours mean and standard deviation of 6,683 and 365 h, respectively. Therefore, it is possible that some SMART attributes reached critical values that were not recorded by Backblaze. Additionally, 8% of all ST3000DM001 drives that Backblaze labeled as failed did not have raw values above zero for the five attributes that were considered critical. Backblaze recorded 25 SMART attributes in total across all hard disk drive brands; the ST3000DM001, with 83.3% of these attributes recorded, ranked as the drive with the most attributes recorded. Having more recorded attributes that can show critical values may lead to more ST3000DM001 drives being labeled as failed, while hard drives from other brands or part numbers may have experienced more critical SMART attribute values but were not labeled as failed because those attributes were not recorded.
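The share of failed drives that never showed a nonzero critical raw value can be estimated from daily drive records along the following lines. This is a minimal sketch, not the authors' procedure. It assumes Backblaze-style column names (`serial_number`, `model`, `failure`, `smart_N_raw`) and assumes that the five critical attributes are SMART 5, 187, 188, 197 and 198, which the text above does not name. The function name `share_failed_without_critical` is hypothetical.

```python
# Assumption: the five "critical" attributes are SMART 5, 187, 188, 197, 198,
# stored in Backblaze-style raw-value columns. The abstract does not name them.
CRITICAL_COLUMNS = (
    "smart_5_raw",
    "smart_187_raw",
    "smart_188_raw",
    "smart_197_raw",
    "smart_198_raw",
)


def _to_int(value):
    """Convert a raw field to int, treating missing or blank values as zero."""
    if value is None or value == "":
        return 0
    return int(float(value))


def share_failed_without_critical(rows, model="ST3000DM001",
                                  columns=CRITICAL_COLUMNS):
    """Fraction of failed drives of `model` whose critical raw values
    never exceeded zero in any recorded row."""
    failed = set()   # serial numbers with a row marked failure == 1
    flagged = set()  # serial numbers with any critical raw value > 0
    for row in rows:
        if row.get("model") != model:
            continue
        serial = row["serial_number"]
        if _to_int(row.get("failure")) == 1:
            failed.add(serial)
        if any(_to_int(row.get(col)) > 0 for col in columns):
            flagged.add(serial)
    if not failed:
        return 0.0
    return len(failed - flagged) / len(failed)
```

The function accepts any iterable of dictionaries, so rows read with `csv.DictReader` from the daily files can be passed in directly. Because only recorded rows are inspected, critical values that occurred before recording began remain invisible to this count, which is the limitation the abstract describes.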