Version 86 (Steve Welburn, 2012-09-25 02:13 PM) → Version 87/110 (Steve Welburn, 2012-09-25 02:14 PM)
h1. WP1.2 Online Training Material
{{>toc}}
We consider three stages of a research project, and the appropriate research data management considerations for each of those stages. The stages are:
* [[Before The Research|before the research]];
* [[During The Research|during the research]];
* [[At The End Of The Research|at the end of the research]].
In addition, we consider the [[Research Management|responsibilities of a Principal Investigator]] regarding data management.
{{include(Data Management By Research Stage)}}
h2. Why manage research data?
Funder requirements: http://researchonline.lshtm.ac.uk/208596/
Ponemon reports for Intel on the "Lost Laptop problem": ~10% of Education and Research laptops are lost during their lifetime.
PC World study on laptop failure rates: 20-30% of laptops suffer a significant failure.
h3. Failure Trends In A Large Disk Drive Population
The study identified ~20% of hard drives being replaced over 3 years as a result of a repair being required!
FAST '07 paper on "Failure Trends In A Large Disk Drive Population":https://www.usenix.org/conference/fast-07/failure-trends-large-disk-drive-population
Google report on over 100,000 consumer-grade disk drives from 80-400 GB produced in or after 2001 and used within Google. Data collected December 2005 - August 2006. Disk drives had a burn-in process and only those that were commissioned for use were included in the study - certain basic defects may well be excluded from this report.
bq. the most accurate definition we can present of a failure event for our study is: a drive is considered to have failed if it was replaced as part of a repairs procedure. Note that this definition implicitly excludes drives that were replaced due to an upgrade.
Annualised failure rate (AFR) by drive age: ~3% in the first 3 months, ~2% up to 1 year, ~8% at 2 years, ~9% at 3 years, ~6% at 4 years, ~7% at 5 years.
NB: Variation with model and manufacturer!
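As a rough cross-check of the ~20% three-year replacement figure, the age-banded rates above can be compounded into a cumulative failure probability. This is only a sketch: it assumes each rate is the probability of failure during that year of life, uses ~2% for the whole first year (ignoring the higher rate in the first 3 months), and treats years as independent. The names @AFR_BY_YEAR@ and @cumulative_failure@ are illustrative, not from the paper.

```python
# Approximate failure rates by year of drive life, taken from the notes above.
# ASSUMPTION: each value is the probability of failing during that year,
# and the first year is simplified to ~2%.
AFR_BY_YEAR = [0.02, 0.08, 0.09, 0.06, 0.07]  # years 1..5


def cumulative_failure(afr_by_year, years):
    """Probability that a drive fails at some point in its first `years` years."""
    survival = 1.0
    for afr in afr_by_year[:years]:
        # The drive must survive every year in turn.
        survival *= 1.0 - afr
    return 1.0 - survival


if __name__ == "__main__":
    for years in range(1, len(AFR_BY_YEAR) + 1):
        print(f"{years} year(s): {cumulative_failure(AFR_BY_YEAR, years):.1%}")
```

Under these assumptions the three-year figure comes out at about 18%, which is in the same region as the ~20% quoted above.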
In the first 6 months, the risk of failure is highest for drives with low or high utilisation:
* ~10% for high utilisation in the first 3 months
* for 3-year-old drives, ~4-5% chance of failure whatever the utilisation
* failures are most likely at low drive temperatures (on start-up?), i.e. below 25 deg. C
* drives over 2 years old are most likely to fail at high temperatures (could this be the mode of failure?)
Disks with SMART scan errors are 10 times more likely to fail - almost 30% of drives with a SMART scan error failed within 8 months of the error.
* If a drive up to 8 months old gets a scan error, there's a 90% chance of it surviving at least 8 months
* If a drive over 2 years old gets a scan error, there's a 60% chance of it surviving at least 8 months
* If you have more than 1 scan error on a drive, it's significantly less likely to survive
* Similar for SMART reallocation counts: AFR is almost 20% if a reallocation occurs in the first 3 months
* ...but over 36% of failed drives had zero counts on all variables
bq. Talagala and Patterson [20] perform a detailed error analysis of 368 SCSI disk drives over an eighteen month period, reporting a failure rate of 1.9%. Results on a larger number of desktop-class ATA drives under deployment at the Internet Archive are presented by Schwarz et al [17]. They report on a 2% failure rate for a population of 2489 disks during 2005, while mentioning that replacement rates have been as high as 6% in the past. Gray and van Ingen [9] cite observed failure rates ranging from 3.3-6% in two large web properties with 22,400 and 15,805 disks respectively. A recent study by Schroeder and Gibson [16] helps shed light into the statistical properties of disk drive failures. The study uses failure data from several large scale deployments, including a large number of SATA drives. They report a significant overestimation of mean time to failure by manufacturers and a lack of infant mortality effects. None of these user studies have attempted to correlate failures with SMART parameters or other environmental factors.
Hard drive manufacturers often quote yearly failure rates below 2% [2].
User studies have seen rates as high as 6% [9].
Between 15% and 60% of drives returned to manufacturers as failed by users have no defect as far as the manufacturers are concerned [7].
Between 20% and 30% of cases were “no problem found” after analyzing failed drives from a study of 3477 disks [11].
Failure rates are known to be highly correlated with drive models, manufacturers and vintages [18].
h2. Overarching concerns
Human participation - ethics, data protection
Audio data - copyright
Storage - where? how? SLA?
Short-term resilient storage for work-in-progress
Long-term archival storage for research data outputs
Curation of archived data - refreshing media and formats
Drivers - FoI, RCUK