Evidence Promoting Good Data Management » History » Version 22

Steve Welburn, 2012-11-12 02:13 PM

h1. Evidence Promoting Good Data Management

{{>toc}}

If you have any additional examples that you would like to share, please email them to: rdm.c4dm at gmail.com

h2. Anecdotal Tales Of Lost Data

h3. Recovery of Overwritten Hard Disk Data

5 October 2005 Linux Forums - http://tinyurl.com/8t7uaop

<pre>
Hi, a friend of mine just overwrote two months of her
PhD thesis with an older version. I know recovery of
overwritten data is possible, but wonder if I'd need
special hardware to do it. Does anyone know something
about this ?

Thank You.
</pre>

h3. Stolen laptop had PhD research

19 March 2008 Surrey Leader - http://tinyurl.com/9hmtlv4

<pre>
Thirty-five minutes spent in Langley’s Willowbrook Shopping Centre cost
a Surrey woman much more than she had anticipated.
Langley RCMP say that while she was shopping from 1-1:35 p.m. last
Monday, someone broke into her vehicle and stole a number of items,
including a Mac iBook laptop containing the research she had compiled
as she worked towards her PhD.
“All that information was on that computer and she has no back-up
file,” said Langley RCMP spokesman Cpl. Brenda Marshall.
</pre>

h3. Happiness is the return of a stolen computer, with data intact

27 May 2010 The Press, NZ - http://tinyurl.com/38sznnh

<pre>
Never has a man been so happy to see a computer full of data
spreadsheets.

Claudio De Sassi's world fell apart when a car containing almost three
years work towards his PhD was stolen two weeks ago.
De Sassi, a Canterbury University academic, could not hide his joy
yesterday as police reunited him with his stolen laptop and backpack.
</pre>

h3. Thugs steal Christmas, doctoral dreams

22 December 2010 KRQE - http://tinyurl.com/9a5j56f

<pre>
A tiny television sits where a big screen used to, and a Christmas tree
stands with little underneath it...

Even worse than the gifts, the crooks stole a MacBook Pro laptop and a
LaCie hard drive.

The hard drive had … her dissertation and nearly seven years of
research for her doctoral degree she was set to finish in a few weeks.
Osuna had everything backed up on a separate hard drive in a safe, but
burglars made off with that too.

"All I could think about is that all that time is gone, all that effort,
everything is gone," Osuna said.
</pre>

h3. Laptop Stolen From OSU Doctoral Student

6 January 2011 NBC4i - http://tinyurl.com/bmybv9x

<pre>
...her car was broken into and her chrome Mac book pro was stolen.
She has a back-up for all but the last six months of research, but the
most important part of the research had happened recently.
</pre>

h2. The Lost Laptop Problem

* 2010 Ponemon Institute report for Intel re. US laptops
** On average, 2.3% of laptops assigned to employees are lost each year
** In education & research that rises to 3.7%, with 10.8% of laptops being lost before the end of their useful life (~3 years, i.e. within 1 PhD of allocation!)
** 75% lost outside the workplace
* Very similar results from 2011 European report!

Intel 2010 - http://tinyurl.com/8c9m4bn
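As a rough sanity check, the 3.7% annual loss rate and the 10.8% lifetime figure quoted above can be related by simple compounding. The sketch below assumes an independent, constant loss rate in each of 3 years; that is an assumption made here for illustration, not necessarily how the Ponemon report derived its number, and the function name is hypothetical.

```python
def lifetime_loss_probability(annual_loss_rate: float, years: int) -> float:
    """Chance of losing a laptop at least once in `years` years.

    Assumes the same, independent loss rate every year. This is a
    simplifying assumption, not the Ponemon report's own method.
    """
    return 1.0 - (1.0 - annual_loss_rate) ** years


# Figures quoted above for education & research laptops.
annual_rate = 0.037      # 3.7% lost per year
useful_life_years = 3    # ~3 years of useful life

lifetime_rate = lifetime_loss_probability(annual_rate, useful_life_years)
print(f"Estimated lifetime loss rate: {lifetime_rate:.1%}")  # prints 10.7%
```

Under these assumptions the estimate comes out at about 10.7%, close to the 10.8% reported.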

h2. Laptop Reliability

* 2011 PC World Laptop Reliability Survey from 63,000 readers:
** 22.6% had significant problems during the product's lifetime
** Of which...
*** 19% had OS problems (~1 in 25 of all laptops)
*** 18% had HDD problems (~1 in 25 of all laptops)
*** 10% had PSU problems (~1 in 50 of all laptops)

PC World 2011 - http://tinyurl.com/876qza5
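The "1 in N of all laptops" figures above come from multiplying the overall problem rate by each problem's share. A minimal sketch of that arithmetic, using only the survey percentages quoted above (the variable and function names are illustrative):

```python
# Share of all surveyed laptops with a significant problem.
SIGNIFICANT_PROBLEM_RATE = 0.226

# Share of those problem laptops affected by each kind of fault.
problem_shares = {"OS": 0.19, "HDD": 0.18, "PSU": 0.10}


def overall_rates(problem_rate: float, shares: dict) -> dict:
    """Convert 'share of problem laptops' into 'share of all laptops'."""
    return {name: problem_rate * share for name, share in shares.items()}


rates = overall_rates(SIGNIFICANT_PROBLEM_RATE, problem_shares)
for name, rate in rates.items():
    print(f"{name}: {rate:.1%} of all laptops, about 1 in {round(1 / rate)}")
```

This gives roughly 1 in 23 for OS problems, 1 in 25 for HDD problems and 1 in 44 for PSU problems, so the rounded figures in the list are approximations.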

h2. Hard Disk Failures

* Failure Trends In A Large Disk Drive Population
** USENIX Conference on File and Storage Technologies 2007 (FAST '07)
** Eduardo Pinheiro, Wolf-Dietrich Weber & Luiz André Barroso, Google Inc.
* Data collected from over 100,000 disk drives at Google
* As part of repairs procedures:
** ~13% of disk drives replaced over 3 years
** ~20% of disk drives replaced over 4 years

Article: http://tinyurl.com/octz6b
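To compare the multi-year replacement figures above with yearly failure rates, they can be converted to an equivalent constant annual rate. This is only a sketch: it assumes the same replacement rate every year, which is a simplification made here (real drives fail at different rates at different ages), and the function name is hypothetical.

```python
def equivalent_annual_rate(cumulative_rate: float, years: int) -> float:
    """Constant yearly rate that would produce `cumulative_rate` after `years`.

    Assumes an identical, independent rate each year (a simplification).
    """
    return 1.0 - (1.0 - cumulative_rate) ** (1.0 / years)


three_year = equivalent_annual_rate(0.13, 3)  # ~13% replaced over 3 years
four_year = equivalent_annual_rate(0.20, 4)   # ~20% replaced over 4 years

print(f"3-year figure: about {three_year:.1%} per year")  # about 4.5%
print(f"4-year figure: about {four_year:.1%} per year")   # about 5.4%
```

On this reading, roughly 1 drive in 20 was replaced each year.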

h2. Data management in the cloud

See JISC/DCC document "Curation In The Cloud" - http://tinyurl.com/8nogtmv

Service agreements may give wide-ranging rights to the data service.

h3. Google Terms Of Service

1 March 2012 Google Terms of Service : http://tinyurl.com/89dc9fa

<pre>
When you upload or otherwise submit content to our Services, you give
Google (and those we work with) a worldwide license to use, host, store,
reproduce, modify, create derivative works (such as those resulting from
translations, adaptations or other changes we make so that your
content works better with our Services), communicate, publish, publicly
perform, publicly display and distribute such content. The rights you
grant in this license are for the limited purpose of operating, promoting,
and improving our Services, and to develop new ones. This license
continues even if you stop using our Services (for example, for a
business listing you have added to Google Maps).
</pre>

h3. Microsoft Services Agreement

19 October 2012 Microsoft services agreement : http://tinyurl.com/8e4kucy

<pre>
When you upload your content to the services, you agree that it may
be used, modified, adapted, saved, reproduced, distributed, and
displayed to the extent necessary to protect you and to provide, protect
and improve Microsoft products and services. For example, we may
occasionally use automated means to isolate information from email,
chats, or photos in order to help detect and protect against spam and
malware, or to improve the services with new features that makes them
easier to use. When processing your content, Microsoft takes steps to
help preserve your privacy.
</pre>

h2. Archiving Data

h3. BBC Domesday Project

1986 project to create a modern-day Domesday Book (early crowd-sourcing)
* Used “BBC Master” computers with data on laserdisc
* Collected 147,819 pages of text and 23,225 photos
* Media expiring and obsolete technology put the data at risk!

Domesday Reloaded (2011)
* Required emulation of software
* Images restored from original masters
* http://www.bbc.co.uk/history/domesday

To allow long-term access to data
* Don't use obscure formats!
* Don't use obscure media!
* Don't rely on technology being available!
* Do keep original source material!

Google images for "BBC Domesday":https://www.google.co.uk/search?tbm=isch&q=bbc+domesday

h2. Related Media

h3. Disk Drives Break

"DataCent collection of disk drive failure sounds":http://datacent.com/hard_drive_sounds.php

h3. Buildings burn down

"Southampton University Mountbatten Building Fire":http://www.flickr.com/search/?q=Southampton%20University%20Mountbatten%20Building%20Fire

h3. Laptops Break / Get Broken

* "Shot laptop":http://lilysussman.wordpress.com/tag/laptop-destroyed/
* "Google images of broken laptops":https://www.google.co.uk/search?q=broken%20laptop&um=1&tbm=isch

h1. xxxxxx

Ponemon reports for Intel on the "Lost Laptop problem": ~10% of Education and Research laptops are lost during their lifetime.

PC World study on laptop failure rates: 20-30% of laptops had a significant failure.

h3. Failure Trends In A Large Disk Drive Population

Identified ~13% of hard drives being replaced over 3 years, 20% over 4 years as a result of a repair being required!

FAST '07 paper on "Failure Trends In A Large Disk Drive Population":https://www.usenix.org/conference/fast-07/failure-trends-large-disk-drive-population

Google report on over 100,000 consumer-grade disk drives from 80-400 GB produced in or after 2001 and used within Google. Data collected December 2005 - August 2006. Disk drives had a burn-in process and only those that were commissioned for use were included in the study - certain basic defects may well be excluded from this report. Also, disks were largely used in servers, resulting in (relatively) high hours of use compared to desktop / laptop computers.

bq. the most accurate definition we can present of a failure event for our study is: a drive is considered to have failed if it was replaced as part of a repairs procedure. Note that this definition implicitly excludes drives that were replaced due to an upgrade.

Annualised failure rate (AFR) by drive age: ~3% in first 3 months, ~2% up to 1 year, ~8% @ 2 years, ~9% @ 3 years, ~6% @ 4 years, ~7% @ 5 years

NB: Variation with model and manufacturer!

In the first 6 months, the risk of failure is highest for low & high utilisation!
* ~10% for high utilisation in the first 3 months
* for 3-year old drives ~4-5% chance of failure whatever the utilisation
* failures are most likely at low drive temperatures (on start-up?) i.e. < 25 deg. C
* drives over 2 years old are most likely to fail at high temperatures (could be mode of failure?)

Disks with SMART scan errors are 10 times more likely to fail - almost 30% of drives with a SMART scan error failed within 8 months of the error.
* If a drive up to 8 months old gets a scan error, there's a 90% chance of it surviving at least 8 months
* If a drive over 2 years old gets a scan error, there's a 60% chance of it surviving at least 8 months
* If you have more than 1 scan error on a drive, it's significantly less likely to survive
* Similar for SMART reallocation counts: AFR is almost 20% if reallocation occurs in the first 3 months
* ...but over 36% of failed drives had zero counts on all variables

bq. Talagala and Patterson [20] perform a detailed error analysis of 368 SCSI disk drives over an eighteen month period, reporting a failure rate of 1.9%. Results on a larger number of desktop-class ATA drives under deployment at the Internet Archive are presented by Schwarz et al [17]. They report on a 2% failure rate for a population of 2489 disks during 2005, while mentioning that replacement rates have been as high as 6% in the past. Gray and van Ingen [9] cite observed failure rates ranging from 3.3-6% in two large web properties with 22,400 and 15,805 disks respectively. A recent study by Schroeder and Gibson [16] helps shed light into the statistical properties of disk drive failures. The study uses failure data from several large scale deployments, including a large number of SATA drives. They report a significant overestimation of mean time to failure by manufacturers and a lack of infant mortality effects. None of these user studies have attempted to correlate failures with SMART parameters or other environmental factors.

Hard drive manufacturers often quote yearly failure rates below 2% [2].
User studies have seen rates as high as 6% [9].

Between 15-60% of drives that users considered to have failed and returned to manufacturers have no defect as far as the manufacturers are concerned [7].
Between 20-30% “no problem found” cases were observed after analyzing failed drives from a study of 3477 disks [11].

Failure rates are known to be highly correlated with drive models, manufacturers and vintages [18].

Sharing Detailed Research Data Is Associated with Increased Citation Rate

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0000308

Gleditsch, N.P., C. Metelits and H. Strand. 2003. Posting your data: Will you be scooped or will you be famous? Int. Stud. Perspect. 4:89–97.

Freckleton, R.P., P. Hulme, P. Giller and G. Kerby. 2005. The changing face of applied ecology. J. Appl. Ecol. 42:1–3.

Lessons from the JMCB Archive

http://muse.jhu.edu/journals/mcb/summary/v038/38.4mccullough.html