Evidence Promoting Good Data Management » History » Version 25

Steve Welburn, 2012-11-12 02:14 PM

h1. Evidence Promoting Good Data Management

{{>toc}}

If you have any additional examples that you would like to share, please email them to: rdm.c4dm at gmail.com

h2. Anecdotal Tales Of Lost Data

h3. Recovery of Overwritten Hard Disk Data

5 October 2005 Linux Forums - http://tinyurl.com/8t7uaop

<pre>
Hi, a friend of mine just overwrote two months of her
PhD thesis with an older version. I know recovery of
overwritten data is possible, but wonder if I'd need
special hardware to do it. Does anyone know something
about this ?

Thank You.
</pre>

h3. Stolen laptop had PhD research

19 March 2008 Surrey Leader - http://tinyurl.com/9hmtlv4

<pre>
Thirty-five minutes spent in Langley’s Willowbrook
Shopping Centre cost a Surrey woman much more than
she had anticipated.

Langley RCMP say that while she was shopping from
1-1:35 p.m. last Monday, someone broke into her
vehicle and stole a number of items, including
a Mac iBook laptop containing the research she had
compiled as she worked towards her PhD.

“All that information was on that computer and she
has no back-up file,” said Langley RCMP spokesman
Cpl. Brenda Marshall.
</pre>

h3. Happiness is the return of a stolen computer, with data intact

27 May 2010 The Press, NZ - http://tinyurl.com/38sznnh

<pre>
Never has a man been so happy to see a computer full of data
spreadsheets.

Claudio De Sassi's world fell apart when a car containing almost three
years work towards his PhD was stolen two weeks ago.
De Sassi, a Canterbury University academic, could not hide his joy
yesterday as police reunited him with his stolen laptop and backpack.
</pre>

h3. Thugs steal Christmas, doctoral dreams

22 December 2010 KRQE - http://tinyurl.com/9a5j56f

<pre>
A tiny television sits where a big screen used to, and a Christmas tree
stands with little underneath it...

Even worse than the gifts, the crooks stole a MacBook Pro laptop and a
LaCie hard drive.

The hard drive had … her dissertation and nearly seven years of
research for her doctoral degree she was set to finish in a few weeks.
Osuna had everything backed up on a separate hard drive in a safe, but
burglars made off with that too.

"All I could think about is that all that time is gone, all that effort,
everything is gone," Osuna said.
</pre>

h3. Laptop Stolen From OSU Doctoral Student

6 January 2011 NBC4i - http://tinyurl.com/bmybv9x

<pre>
...her car was broken into and her chrome Mac book pro was stolen.
She has a back-up for all but the last six months of research, but the
most important part of the research had happened recently.
</pre>

h2. The Lost Laptop Problem

* 2010 Ponemon Institute report for Intel on laptops in the US
** On average, 2.3% of laptops assigned to employees are lost each year
** In education & research that rises to 3.7%, with 10.8% of laptops being lost before the end of their useful life
*** Useful life is ~3 years, i.e. the laptop is lost within the length of one PhD of being allocated!
** 75% are lost outside the workplace
* Very similar results from the 2011 European report!

Intel 2010 - http://tinyurl.com/8c9m4bn

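The lifetime figure can be sanity-checked against the annual rates with a little arithmetic. The sketch below is illustrative only: it assumes the same, independent chance of loss in every year, which the Ponemon report does not state, and @cumulative_loss@ is a hypothetical helper name.

```python
def cumulative_loss(annual_rate: float, years: int) -> float:
    """Chance that a laptop is lost at least once within `years` years.

    Assumption (not from the report): the annual loss probability is
    constant and independent from one year to the next.
    """
    return 1.0 - (1.0 - annual_rate) ** years


if __name__ == "__main__":
    # Annual loss rates quoted in the list above.
    rates = {"all employees": 0.023, "education & research": 0.037}
    for sector, rate in rates.items():
        print(f"{sector}: {cumulative_loss(rate, 3):.1%} lost within 3 years")
```

Under that assumption, a 3.7% annual rate compounds to roughly 10.7% over three years, which is close to the 10.8% lifetime figure reported for education & research.
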
h2. Laptop Reliability

* 2011 PC World Laptop Reliability Survey of 63,000 readers:
** 22.6% had significant problems during the product's lifetime
** Of which...
*** 19% had OS problems (~1 in 25 of all laptops)
*** 18% had HDD problems (~1 in 25 of all laptops)
*** 10% had PSU problems (~1 in 50 of all laptops)

PC World 2011 - http://tinyurl.com/876qza5

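The "1 in N of all laptops" figures come from multiplying the overall problem rate by each component's share of those problems. A minimal sketch of that arithmetic follows; the constant and function names are hypothetical, and only the percentages come from the survey.

```python
# Figures quoted from the 2011 PC World survey in the list above.
SIGNIFICANT_PROBLEM_RATE = 0.226  # share of all laptops with a significant problem
PROBLEM_BREAKDOWN = {"OS": 0.19, "HDD": 0.18, "PSU": 0.10}  # share of those problems


def share_of_all_laptops(share_of_problems: float) -> float:
    """Convert a share of problem laptops into a share of all laptops."""
    return SIGNIFICANT_PROBLEM_RATE * share_of_problems


if __name__ == "__main__":
    for component, share in PROBLEM_BREAKDOWN.items():
        overall = share_of_all_laptops(share)
        print(f"{component}: {overall:.1%} of all laptops (about 1 in {1 / overall:.0f})")
```

The exact products come out at roughly 1 in 23, 1 in 25 and 1 in 44, so the figures in the list are rounded approximations.
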
h2. Hard Disk Failures

* Failure Trends In A Large Disk Drive Population
** USENIX Conference on File and Storage Technologies 2007 (FAST '07)
** Eduardo Pinheiro, Wolf-Dietrich Weber & Luiz André Barroso, Google Inc.
* Data collected from over 100,000 disk drives at Google
* Replaced as part of repairs procedures:
** ~13% of disk drives over 3 years
** ~20% of disk drives over 4 years

Article: http://tinyurl.com/octz6b

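To compare the multi-year replacement figures with yearly failure rates, they can be converted into an equivalent average annual rate. This is a rough sketch that assumes the same fraction of surviving drives is replaced every year; the paper itself reports rates that vary with drive age, and @equivalent_annual_rate@ is a hypothetical helper name.

```python
def equivalent_annual_rate(cumulative_rate: float, years: float) -> float:
    """Constant annual replacement rate that compounds to `cumulative_rate`.

    Assumption (not from the paper): replacements are spread evenly, so the
    same fraction of surviving drives is replaced every year.
    """
    return 1.0 - (1.0 - cumulative_rate) ** (1.0 / years)


if __name__ == "__main__":
    # Cumulative replacement figures quoted in the list above.
    for cumulative, years in [(0.13, 3), (0.20, 4)]:
        annual = equivalent_annual_rate(cumulative, years)
        print(f"{cumulative:.0%} over {years} years is about {annual:.1%} per year")
```

Under that assumption, both figures correspond to roughly 4.5-5.5% of drives being replaced each year on average.
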
h2. Data Management In The Cloud

See JISC/DCC document "Curation In The Cloud" - http://tinyurl.com/8nogtmv

Service agreements may grant the service provider wide-ranging rights over your data.

h3. Google Terms Of Service

1 March 2012 Google Terms of Service - http://tinyurl.com/89dc9fa

<pre>
When you upload or otherwise submit content to our Services, you give
Google (and those we work with) a worldwide license to use, host, store,
reproduce, modify, create derivative works (such as those resulting from
translations, adaptations or other changes we make so that your
content works better with our Services), communicate, publish, publicly
perform, publicly display and distribute such content. The rights you
grant in this license are for the limited purpose of operating, promoting,
and improving our Services, and to develop new ones. This license
continues even if you stop using our Services (for example, for a
business listing you have added to Google Maps).
</pre>

h3. Microsoft Services Agreement

19 October 2012 Microsoft Services Agreement - http://tinyurl.com/8e4kucy

<pre>
When you upload your content to the services, you agree that it may
be used, modified, adapted, saved, reproduced, distributed, and
displayed to the extent necessary to protect you and to provide, protect
and improve Microsoft products and services. For example, we may
occasionally use automated means to isolate information from email,
chats, or photos in order to help detect and protect against spam and
malware, or to improve the services with new features that makes them
easier to use. When processing your content, Microsoft takes steps to
help preserve your privacy.
</pre>

h2. Archiving Data

h3. BBC Domesday Project

1986 project to create a modern-day Domesday Book (an early example of crowd-sourcing):
* Used “BBC Master” computers with data on laserdisc
* Collected 147,819 pages of text and 23,225 photos
* Deteriorating media and obsolete technology put the data at risk!

Domesday Reloaded (2011):
* Required emulation of software
* Images restored from original masters
* http://www.bbc.co.uk/history/domesday

To allow long-term access to data:
* Don't use obscure formats!
* Don't use obscure media!
* Don't rely on technology being available!
* Do keep original source material!

Google images for "BBC Domesday":https://www.google.co.uk/search?tbm=isch&q=bbc+domesday

h2. Related Media

h3. Disk Drives Break

"DataCent collection of disk drive failure sounds":http://datacent.com/hard_drive_sounds.php

h3. Buildings Burn Down

"Southampton University Mountbatten Building Fire":http://www.flickr.com/search/?q=Southampton%20University%20Mountbatten%20Building%20Fire

h3. Laptops Break / Get Broken

* "Shot laptop":http://lilysussman.wordpress.com/tag/laptop-destroyed/
* "Google images of broken laptops":https://www.google.co.uk/search?q=broken%20laptop&um=1&tbm=isch

h1. Draft Notes

Ponemon reports for Intel on the "Lost Laptop Problem": ~10% of education and research laptops are lost during their useful life.

PC World study on laptop failure rates: 20-30% of laptops had a significant failure.

h3. Failure Trends In A Large Disk Drive Population

The study found that ~13% of hard drives were replaced within 3 years, and ~20% within 4 years, as part of a repairs procedure!

FAST '07 paper on "Failure Trends In A Large Disk Drive Population":https://www.usenix.org/conference/fast-07/failure-trends-large-disk-drive-population

Google report on over 100,000 consumer-grade disk drives from 80-400 GB, produced in or after 2001 and used within Google. Data collected December 2005 - August 2006. Drives went through a burn-in process and only those commissioned for use were included in the study, so certain basic defects may well be excluded from this report. Also, the disks were largely used in servers, resulting in (relatively) long hours of use compared with desktop / laptop computers.

bq. the most accurate definition we can present of a failure event for our study is: a drive is considered to have failed if it was replaced as part of a repairs procedure. Note that this definition implicitly excludes drives that were replaced due to an upgrade.

Annualised failure rates (AFR) by drive age: ~3% in the first 3 months, ~2% up to 1 year, ~8% at 2 years, ~9% at 3 years, ~6% at 4 years, ~7% at 5 years.

NB: Variation with model and manufacturer!

In the first 6 months, the risk of failure is highest for drives with low or high utilisation!
* ~10% for high utilisation in the first 3 months
* for 3-year-old drives, a ~4-5% chance of failure whatever the utilisation
* failures are most likely at low drive temperatures, i.e. < 25 deg. C (on start-up?)
* drives over 2 years old are most likely to fail at high temperatures (could this be the mode of failure?)

Disks with SMART scan errors are 10 times more likely to fail: almost 30% of drives with a SMART scan error failed within 8 months of the error.
* If a drive up to 8 months old gets a scan error, there's a 90% chance of it surviving at least another 8 months
* If a drive over 2 years old gets a scan error, there's a 60% chance of it surviving at least another 8 months
* If you have more than 1 scan error on a drive, it's significantly less likely to survive
* Similar for SMART reallocation counts: the AFR is almost 20% if a reallocation occurs in the first 3 months
* ...but over 36% of failed drives had zero counts on all SMART variables

bq. Talagala and Patterson [20] perform a detailed error analysis of 368 SCSI disk drives over an eighteen month period, reporting a failure rate of 1.9%. Results on a larger number of desktop-class ATA drives under deployment at the Internet Archive are presented by Schwarz et al [17]. They report on a 2% failure rate for a population of 2489 disks during 2005, while mentioning that replacement rates have been as high as 6% in the past. Gray and van Ingen [9] cite observed failure rates ranging from 3.3-6% in two large web properties with 22,400 and 15,805 disks respectively. A recent study by Schroeder and Gibson [16] helps shed light into the statistical properties of disk drive failures. The study uses failure data from several large scale deployments, including a large number of SATA drives. They report a significant overestimation of mean time to failure by manufacturers and a lack of infant mortality effects. None of these user studies have attempted to correlate failures with SMART parameters or other environmental factors.

Hard drive manufacturers often quote yearly failure rates below 2% [2].
User studies have seen rates as high as 6% [9].

Between 15% and 60% of drives that users considered to have failed and returned to manufacturers have no defect as far as the manufacturers are concerned [7].
Between 20% and 30% “no problem found” cases were observed after analysing failed drives from a study of 3477 disks [11].

Failure rates are known to be highly correlated with drive models, manufacturers and vintages [18].

Sharing Detailed Research Data Is Associated with Increased Citation Rate - http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0000308

Gleditsch, N.P., C. Metelits and H. Strand. 2003. Posting your data: Will you be scooped or will you be famous? Int. Stud. Perspect. 4:89–97.

Freckleton, R.P., P. Hulme, P. Giller and G. Kerby. 2005. The changing face of applied ecology. J. Appl. Ecol. 42:1–3.

Lessons from the JMCB Archive - http://muse.jhu.edu/journals/mcb/summary/v038/38.4mccullough.html