Evidence Promoting Good Data Management » History » Version 23

Steve Welburn, 2012-11-12 02:13 PM

h1. Evidence Promoting Good Data Management

{{>toc}}

If you have any additional examples that you would like to share, please email them to: rdm.c4dm at gmail.com

h2. Anecdotal Tales Of Lost Data

h3. Recovery of Overwritten Hard Disk Data

5 October 2005 Linux Forums - http://tinyurl.com/8t7uaop

<pre>
Hi, a friend of mine just overwrote two months of her
PhD thesis with an older version. I know recovery of
overwritten data is possible, but wonder if I'd need
special hardware to do it. Does anyone know something
about this ?

Thank You.
</pre>

h3. Stolen laptop had PhD research

19 March 2008 Surrey Leader - http://tinyurl.com/9hmtlv4

<pre>
Thirty-five minutes spent in Langley’s Willowbrook
Shopping Centre cost a Surrey woman much more than
she had anticipated.

Langley RCMP say that while she was shopping from
1-1:35 p.m. last Monday, someone broke into her
vehicle and stole a number of items, including
a Mac iBook laptop containing the research she had
compiled as she worked towards her PhD.

“All that information was on that computer and she has no back-up
file,” said Langley RCMP spokesman Cpl. Brenda Marshall.
</pre>

h3. Happiness is the return of a stolen computer, with data intact

27 May 2010 The Press, NZ - http://tinyurl.com/38sznnh

<pre>
Never has a man been so happy to see a computer full of data
spreadsheets.

Claudio De Sassi's world fell apart when a car containing almost three
years work towards his PhD was stolen two weeks ago.
De Sassi, a Canterbury University academic, could not hide his joy
yesterday as police reunited him with his stolen laptop and backpack.
</pre>

h3. Thugs steal Christmas, doctoral dreams

22 December 2010 KRQE - http://tinyurl.com/9a5j56f

<pre>
A tiny television sits where a big screen used to, and a Christmas tree
stands with little underneath it...

Even worse than the gifts, the crooks stole a MacBook Pro laptop and a
LaCie hard drive.

The hard drive had … her dissertation and nearly seven years of
research for her doctoral degree she was set to finish in a few weeks.
Osuna had everything backed up on a separate hard drive in a safe, but
burglars made off with that too.

"All I could think about is that all that time is gone, all that effort,
everything is gone," Osuna said.
</pre>

h3. Laptop Stolen From OSU Doctoral Student

6 January 2011 NBC4i - http://tinyurl.com/bmybv9x

<pre>
...her car was broken into and her chrome Mac book pro was stolen.
She has a back-up for all but the last six months of research, but the
most important part of the research had happened recently.
</pre>

h2. The Lost Laptop Problem

* 2010 Ponemon Institute report for Intel re. US laptops
** On average, 2.3% of laptops assigned to employees are lost each year
** In education & research that rises to 3.7%, with 10.8% of laptops being lost before the end of their useful life (~3 years, i.e. within 1 PhD of allocation!)
** 75% lost outside the workplace
* Very similar results from 2011 European report!

Intel 2010 - http://tinyurl.com/8c9m4bn
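
As a rough consistency check on the figures above, the sketch below compounds the 3.7% annual loss rate over a 3-year laptop lifetime. It assumes a constant, independent annual loss rate, which is a simplifying assumption and not necessarily how the Ponemon report derived its figure; the function name is illustrative only.

```python
def lifetime_loss_probability(annual_rate: float, years: int) -> float:
    """Probability that a laptop is lost at least once within `years`.

    Assumes (hypothetically) a constant, independent annual loss rate.
    """
    return 1.0 - (1.0 - annual_rate) ** years


# 3.7% is the education & research annual loss rate quoted above.
estimate = lifetime_loss_probability(0.037, 3)
print(f"Estimated loss over 3 years: {estimate:.1%}")
```

Under that assumption the estimate comes out at roughly 10.7%, close to the 10.8% quoted in the report.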

h2. Laptop Reliability

* 2011 PC World Laptop Reliability Survey from 63,000 readers:
** 22.6% had significant problems during the product's lifetime
** Of which...
*** 19% had OS problems ~1 in 25 of all laptops
*** 18% had HDD problems ~1 in 25 of all laptops
*** 10% PSU problems ~1 in 50 of all laptops

PC World 2011 - http://tinyurl.com/876qza5
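
The "~1 in N of all laptops" figures follow from multiplying the share of laptops with significant problems (22.6%) by each problem type's share of those laptops. A minimal sketch of that arithmetic, using illustrative names that are not taken from the survey:

```python
SIGNIFICANT_PROBLEM_RATE = 0.226  # laptops with significant problems

# Share of problem laptops affected by each problem type (from the list above).
PROBLEM_SHARES = {"OS": 0.19, "HDD": 0.18, "PSU": 0.10}


def share_of_all_laptops(problem_share: float) -> float:
    """Fraction of all surveyed laptops with this problem type."""
    return SIGNIFICANT_PROBLEM_RATE * problem_share


for name, share in PROBLEM_SHARES.items():
    overall = share_of_all_laptops(share)
    print(f"{name}: {overall:.1%} of all laptops, about 1 in {round(1 / overall)}")
```

The computed values (about 1 in 23, 1 in 25 and 1 in 44) are close to the rounded figures in the list.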

h2. Hard Disk Failures

* Failure Trends In A Large Disk Drive Population
** Usenix conference on File and Storage Technologies 2007 (FAST '07)
** Eduardo Pinheiro & Wolf-Dietrich Weber, Google Inc.
* Data collected from over 100,000 disk drives at Google
* As part of repairs procedures:
** ~13% of disk drives replaced over 3 years
** ~20% of disk drives replaced over 4 years

Article: http://tinyurl.com/octz6b

h2. Data management in the cloud

See JISC/DCC document "Curation In The Cloud" - http://tinyurl.com/8nogtmv

Service agreements may give wide-ranging rights to the data service.

h3. Google Terms Of Service

1 March 2012 Google Terms of Service : http://tinyurl.com/89dc9fa

<pre>
When you upload or otherwise submit content to our Services, you give
Google (and those we work with) a worldwide license to use, host, store,
reproduce, modify, create derivative works (such as those resulting from
translations, adaptations or other changes we make so that your
content works better with our Services), communicate, publish, publicly
perform, publicly display and distribute such content. The rights you
grant in this license are for the limited purpose of operating, promoting,
and improving our Services, and to develop new ones. This license
continues even if you stop using our Services (for example, for a
business listing you have added to Google Maps).
</pre>

h3. Microsoft Services Agreement

19 October 2012 Microsoft services agreement : http://tinyurl.com/8e4kucy

<pre>
When you upload your content to the services, you agree that it may
be used, modified, adapted, saved, reproduced, distributed, and
displayed to the extent necessary to protect you and to provide, protect
and improve Microsoft products and services. For example, we may
occasionally use automated means to isolate information from email,
chats, or photos in order to help detect and protect against spam and
malware, or to improve the services with new features that makes them
easier to use. When processing your content, Microsoft takes steps to
help preserve your privacy.
</pre>

h2. Archiving Data

h3. BBC Domesday Project

1986 project to create a modern-day Domesday Book (early crowd-sourcing)
* Used “BBC Master” computers with data on laserdisc
* Collected 147,819 pages of text and 23,225 photos
* Media expiring and obsolete technology put the data at risk!

Domesday Reloaded (2011)
* Required emulation of software
* Images restored from original masters
* http://www.bbc.co.uk/history/domesday

To allow long-term access to data
* Don't use obscure formats!
* Don't use obscure media!
* Don't rely on technology being available!
* Do keep original source material!

Google images for "BBC Domesday":https://www.google.co.uk/search?tbm=isch&q=bbc+domesday

h2. Related Media

h3. Disk Drives Break

"DataCent collection of disk drive failure sounds":http://datacent.com/hard_drive_sounds.php

h3. Buildings Burn Down

"Southampton University Mountbatten Building Fire":http://www.flickr.com/search/?q=Southampton%20University%20Mountbatten%20Building%20Fire

h3. Laptops Break / Get Broken

* "Shot laptop":http://lilysussman.wordpress.com/tag/laptop-destroyed/
* "Google images of broken laptops":https://www.google.co.uk/search?q=broken%20laptop&um=1&tbm=isch

h1. xxxxxx

Ponemon reports for Intel on the "Lost Laptop problem": ~10% of Education and Research laptops are lost during their lifetime.

PC World study on laptop failure rates: 20-30% of laptops suffer a significant failure.

h3. Failure Trends In A Large Disk Drive Population

Identified ~13% of hard drives being replaced over 3 years, and ~20% over 4 years, as a result of a repair being required!

FAST '07 paper on "Failure Trends In A Large Disk Drive Population":https://www.usenix.org/conference/fast-07/failure-trends-large-disk-drive-population

Google report on over 100,000 consumer-grade disk drives from 80-400 GB, produced in or after 2001 and used within Google. Data collected December 2005 - August 2006. Disk drives had a burn-in process and only those that were commissioned for use were included in the study, so certain basic defects may well be excluded from this report. Also, disks were largely used in servers, resulting in (relatively) high hours of use compared with desktop / laptop computers.

bq. the most accurate definition we can present of a failure event for our study is: a drive is considered to have failed if it was replaced as part of a repairs procedure. Note that this definition implicitly excludes drives that were replaced due to an upgrade.

Annualised failure rate (AFR) by drive age: ~3% in first 3 months, ~2% up to 1 year, ~8% @ 2 years, ~9% @ 3 years, ~6% @ 4 years, ~7% @ 5 years

NB: Variation with model and manufacturer!

In the first 6 months, the risk of failure is highest for low & high utilisation!
* ~10% for high utilisation in the first 3 months
* for 3-year old drives ~4-5% chance of failure whatever the utilisation
* failures are most likely at low drive temperatures (on start-up ?) i.e. < 25 deg. C
* drives over 2 years old are most likely to fail at high temperatures (could be mode of failure ?)

Disks with SMART scan errors are 10 times more likely to fail - almost 30% of drives with a SMART scan error failed within 8 months of the error.
* If a drive up to 8 months old gets a scan error, there's a 90% chance of it surviving at least 8 months
* If a drive over 2 years old gets a scan error, there's a 60% chance of it surviving at least 8 months
* If you have more than 1 scan error on a drive, it's significantly less likely to survive
* Similar for SMART reallocation counts: AFR is almost 20% if a reallocation occurs in the first 3 months
* ...but over 36% of failed drives had zero counts on all variables

bq. Talagala and Patterson [20] perform a detailed error analysis of 368 SCSI disk drives over an eighteen month period, reporting a failure rate of 1.9%. Results on a larger number of desktop-class ATA drives under deployment at the Internet Archive are presented by Schwarz et al [17]. They report on a 2% failure rate for a population of 2489 disks during 2005, while mentioning that replacement rates have been as high as 6% in the past. Gray and van Ingen [9] cite observed failure rates ranging from 3.3-6% in two large web properties with 22,400 and 15,805 disks respectively. A recent study by Schroeder and Gibson [16] helps shed light into the statistical properties of disk drive failures. The study uses failure data from several large scale deployments, including a large number of SATA drives. They report a significant overestimation of mean time to failure by manufacturers and a lack of infant mortality effects. None of these user studies have attempted to correlate failures with SMART parameters or other environmental factors.

Hard drive manufacturers often quote yearly failure rates below 2% [2]
User studies have seen rates as high as 6% [9]

Between 15-60% of drives returned to manufacturers having been considered to have failed by users have no defect as far as the manufacturers are concerned [7]
Between 20-30% “no problem found” cases were observed after analyzing failed drives from a study of 3477 disks [11]

Failure rates are known to be highly correlated with drive models, manufacturers and vintages [18].

Sharing Detailed Research Data Is Associated with Increased Citation Rate

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0000308

Gleditsch, N.P., C. Metelits and H. Strand. 2003. Posting your data: Will you be scooped or will you be famous? Int. Stud. Perspect. 4:89–97.

Freckleton, R.P., P. Hulme, P. Giller and G. Kerby. 2005. The changing face of applied ecology. J. Appl. Ecol. 42:1–3.

Lessons from the JMCB Archive

http://muse.jhu.edu/journals/mcb/summary/v038/38.4mccullough.html