Evidence Promoting Good Data Management » History » Version 28
Steve Welburn, 2012-11-12 02:23 PM
1 | 1 | Steve Welburn | h1. Evidence Promoting Good Data Management |
---|---|---|---|
2 | 1 | Steve Welburn | |
3 | 18 | Steve Welburn | {{>toc}} |
4 | 18 | Steve Welburn | |
5 | 20 | Steve Welburn | If you have any additional examples that you would like to share, please email them to: rdm.c4dm at gmail.com |
6 | 11 | Steve Welburn | |
7 | 1 | Steve Welburn | h2. Anecdotal Tales Of Lost Data |
8 | 1 | Steve Welburn | |
9 | 4 | Steve Welburn | h3. Recovery of Overwritten Hard Disk Data |
10 | 5 | Steve Welburn | |
11 | 5 | Steve Welburn | 5 October 2005 Linux Forums - http://tinyurl.com/8t7uaop |
12 | 5 | Steve Welburn | |
13 | 3 | Steve Welburn | <pre> |
14 | 21 | Steve Welburn | Hi, a friend of mine just overwrote two months of her |
15 | 21 | Steve Welburn | PhD thesis with an older version. I know recovery of |
16 | 21 | Steve Welburn | overwritten data is possible, but wonder if I'd need |
17 | 21 | Steve Welburn | special hardware to do it. Does anyone know something |
18 | 21 | Steve Welburn | about this ? |
19 | 21 | Steve Welburn | |
20 | 2 | Steve Welburn | Thank You. |
21 | 3 | Steve Welburn | </pre> |
22 | 1 | Steve Welburn | |
23 | 1 | Steve Welburn | h3. Stolen laptop had PhD research |
24 | 5 | Steve Welburn | |
25 | 5 | Steve Welburn | 19 March 2008 Surrey Leader - http://tinyurl.com/9hmtlv4 |
26 | 5 | Steve Welburn | |
27 | 1 | Steve Welburn | <pre> |
28 | 23 | Steve Welburn | Thirty-five minutes spent in Langley’s Willowbrook |
29 | 23 | Steve Welburn | Shopping Centre cost a Surrey woman much more than |
30 | 23 | Steve Welburn | she had anticipated. |
31 | 23 | Steve Welburn | |
32 | 23 | Steve Welburn | Langley RCMP say that while she was shopping from |
33 | 23 | Steve Welburn | 1-1:35 p.m. last Monday, someone broke into her |
34 | 23 | Steve Welburn | vehicle and stole a number of items, including |
35 | 23 | Steve Welburn | a Mac iBook laptop containing the research she had |
36 | 23 | Steve Welburn | compiled as she worked towards her PhD. |
37 | 23 | Steve Welburn | |
38 | 24 | Steve Welburn | “All that information was on that computer and she |
39 | 24 | Steve Welburn | has no back-up file,” said Langley RCMP spokesman |
40 | 24 | Steve Welburn | Cpl. Brenda Marshall. |
41 | 3 | Steve Welburn | </pre> |
42 | 6 | Steve Welburn | |
43 | 6 | Steve Welburn | h3. Happiness is the return of a stolen computer, with data intact |
44 | 6 | Steve Welburn | |
45 | 6 | Steve Welburn | 27 May 2010 The Press, NZ - http://tinyurl.com/38sznnh |
46 | 6 | Steve Welburn | |
47 | 6 | Steve Welburn | <pre> |
48 | 6 | Steve Welburn | Never has a man been so happy to see a computer full of data |
49 | 6 | Steve Welburn | spreadsheets. |
50 | 6 | Steve Welburn | |
51 | 6 | Steve Welburn | Claudio De Sassi's world fell apart when a car containing almost three |
52 | 6 | Steve Welburn | years work towards his PhD was stolen two weeks ago. |
53 | 6 | Steve Welburn | De Sassi, a Canterbury University academic, could not hide his joy |
54 | 6 | Steve Welburn | yesterday as police reunited him with his stolen laptop and backpack. |
55 | 6 | Steve Welburn | </pre> |
56 | 6 | Steve Welburn | |
57 | 19 | Steve Welburn | h3. Thugs steal Christmas, doctoral dreams |
58 | 8 | Steve Welburn | |
59 | 8 | Steve Welburn | 22 December 2010 KRQE - http://tinyurl.com/9a5j56f |
60 | 8 | Steve Welburn | |
61 | 8 | Steve Welburn | <pre> |
62 | 8 | Steve Welburn | A tiny television sits where a big screen used to, and a Christmas tree |
63 | 8 | Steve Welburn | stands with little underneath it... |
64 | 8 | Steve Welburn | |
65 | 8 | Steve Welburn | Even worse than the gifts, the crooks stole a MacBook Pro laptop and a |
66 | 8 | Steve Welburn | LaCie hard drive. |
67 | 8 | Steve Welburn | |
68 | 8 | Steve Welburn | The hard drive had … her dissertation and nearly seven years of |
69 | 8 | Steve Welburn | research for her doctoral degree she was set to finish in a few weeks. |
70 | 8 | Steve Welburn | Osuna had everything backed up on a separate hard drive in a safe, but |
71 | 8 | Steve Welburn | burglars made off with that too. |
72 | 8 | Steve Welburn | |
73 | 8 | Steve Welburn | "All I could think about is that all that time is gone, all that effort, |
74 | 8 | Steve Welburn | everything is gone," Osuna said. |
75 | 9 | Steve Welburn | </pre> |
76 | 8 | Steve Welburn | |
77 | 8 | Steve Welburn | |
78 | 8 | Steve Welburn | h3. Laptop Stolen From OSU Doctoral Student |
79 | 8 | Steve Welburn | |
80 | 8 | Steve Welburn | 6 January 2011 NBC4i - http://tinyurl.com/bmybv9x |
81 | 8 | Steve Welburn | |
82 | 8 | Steve Welburn | <pre> |
83 | 8 | Steve Welburn | ...her car was broken into and her chrome Mac book pro was stolen. |
84 | 8 | Steve Welburn | She has a back-up for all but the last six months of research, but the |
85 | 8 | Steve Welburn | most important part of the research had happened recently. |
86 | 8 | Steve Welburn | </pre> |
87 | 8 | Steve Welburn | |
88 | 6 | Steve Welburn | h2. The Lost Laptop Problem |
89 | 6 | Steve Welburn | |
90 | 6 | Steve Welburn | * 2010 Ponemon Institute report for Intel on laptops in US organisations |
91 | 6 | Steve Welburn | ** On average, 2.3% of laptops assigned to employees are lost each year |
92 | 25 | Steve Welburn | ** In education & research that rises to 3.7%, with 10.8% of laptops being lost before the end of their useful life |
93 | 25 | Steve Welburn | *** Useful life is ~3 years, i.e. a laptop allocated at the start of a PhD is at risk for the whole PhD! |
94 | 6 | Steve Welburn | ** 75% lost outside the workplace |
95 | 6 | Steve Welburn | * Very similar results from 2011 European report! |
96 | 6 | Steve Welburn | |
97 | 6 | Steve Welburn | Intel 2010 - http://tinyurl.com/8c9m4bn |
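
As a rough sanity check on the figures above, the 10.8% lifetime figure is close to what you get by compounding the 3.7% annual loss rate over a 3-year useful life. The sketch below assumes a constant annual rate and independence between years; neither assumption comes from the report, and the function name is purely illustrative.

```python
def cumulative_loss(annual_rate: float, years: int) -> float:
    """Probability that a laptop is lost at least once within `years` years.

    Assumes (for illustration only) a constant annual loss rate and
    independence from one year to the next.
    """
    return 1.0 - (1.0 - annual_rate) ** years


if __name__ == "__main__":
    # Annual loss rates quoted in the list above.
    for label, rate in [("all sectors", 0.023), ("education & research", 0.037)]:
        print(f"{label}: {cumulative_loss(rate, 3):.1%} lost within 3 years")
```

Under these assumptions the education & research figure comes out at roughly 10.7%, in line with the 10.8% quoted in the report.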
98 | 7 | Steve Welburn | |
99 | 7 | Steve Welburn | h2. Laptop Reliability |
100 | 7 | Steve Welburn | |
101 | 7 | Steve Welburn | * 2011 PC World Laptop Reliability Survey from 63,000 readers: |
102 | 7 | Steve Welburn | ** 22.6% had significant problems during the product's lifetime |
103 | 7 | Steve Welburn | ** Of those with problems: |
104 | 7 | Steve Welburn | *** 19% had OS problems (~1 in 25 of all laptops) |
105 | 7 | Steve Welburn | *** 18% had HDD problems (~1 in 25 of all laptops) |
106 | 7 | Steve Welburn | *** 10% had PSU problems (~1 in 50 of all laptops) |
107 | 7 | Steve Welburn | |
108 | 7 | Steve Welburn | PC World 2011 - http://tinyurl.com/876qza5 |
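
The "1 in N of all laptops" figures follow from multiplying each fault's share of problem laptops by the 22.6% overall problem rate. A minimal sketch of that arithmetic, assuming the fault percentages are shares of the laptops that had significant problems (the constant and function names are illustrative):

```python
# Share of all laptops with significant problems (survey figure).
PROBLEM_RATE = 0.226

# Share of *problem* laptops affected by each fault type (survey figures).
FAULT_SHARES = {"OS": 0.19, "HDD": 0.18, "PSU": 0.10}


def overall_rate(share_of_problems: float, problem_rate: float = PROBLEM_RATE) -> float:
    """Convert a share of problem laptops into a share of all laptops."""
    return share_of_problems * problem_rate


if __name__ == "__main__":
    for fault, share in FAULT_SHARES.items():
        rate = overall_rate(share)
        print(f"{fault}: {rate:.1%} of all laptops, about 1 in {round(1 / rate)}")
```

The unrounded results (about 1 in 23, 1 in 25 and 1 in 44) are in the same ballpark as the rounded figures in the list.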
109 | 8 | Steve Welburn | |
110 | 8 | Steve Welburn | h2. Hard Disk Failures |
111 | 8 | Steve Welburn | |
112 | 8 | Steve Welburn | * Failure Trends In A Large Disk Drive Population |
113 | 8 | Steve Welburn | ** USENIX Conference on File and Storage Technologies 2007 (FAST '07) |
114 | 8 | Steve Welburn | ** Eduardo Pinheiro & Wolf-Dietrich Weber, Google Inc. |
115 | 8 | Steve Welburn | * Data collected from over 100,000 disk drives at Google |
116 | 8 | Steve Welburn | * Drives replaced as part of a repairs procedure: |
117 | 8 | Steve Welburn | ** ~13% of disk drives replaced over 3 years |
118 | 8 | Steve Welburn | ** ~20% of disk drives replaced over 4 years |
119 | 8 | Steve Welburn | |
120 | 8 | Steve Welburn | Article: http://tinyurl.com/octz6b |
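
To compare these multi-year replacement figures with a yearly failure rate, they can be converted to an equivalent constant annual rate. This is only a sketch: it assumes the same rate applies every year, which real drives do not follow, and the function name is illustrative.

```python
def equivalent_annual_rate(cumulative: float, years: float) -> float:
    """Constant annual rate that would give `cumulative` replacements
    after `years` years (illustrative simplification)."""
    return 1.0 - (1.0 - cumulative) ** (1.0 / years)


if __name__ == "__main__":
    # Cumulative replacement figures quoted in the list above.
    for cumulative, years in [(0.13, 3), (0.20, 4)]:
        rate = equivalent_annual_rate(cumulative, years)
        print(f"{cumulative:.0%} over {years} years is about {rate:.1%} per year")
```

Under this simplification the figures correspond to roughly 4.5% and 5.4% of drives per year.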
121 | 8 | Steve Welburn | |
122 | 8 | Steve Welburn | h2. Data management in the cloud |
123 | 8 | Steve Welburn | |
124 | 8 | Steve Welburn | See JISC/DCC document "Curation In The Cloud" - http://tinyurl.com/8nogtmv |
125 | 8 | Steve Welburn | |
126 | 8 | Steve Welburn | Service agreements may grant the service provider wide-ranging rights over the data you upload. |
127 | 8 | Steve Welburn | |
128 | 8 | Steve Welburn | h3. Google Terms Of Service |
129 | 8 | Steve Welburn | |
130 | 8 | Steve Welburn | 1 March 2012 Google Terms of Service : http://tinyurl.com/89dc9fa |
131 | 8 | Steve Welburn | |
132 | 8 | Steve Welburn | <pre> |
133 | 8 | Steve Welburn | When you upload or otherwise submit content to our Services, you give |
134 | 8 | Steve Welburn | Google (and those we work with) a worldwide license to use, host, store, |
135 | 8 | Steve Welburn | reproduce, modify, create derivative works (such as those resulting from |
136 | 8 | Steve Welburn | translations, adaptations or other changes we make so that your |
137 | 8 | Steve Welburn | content works better with our Services), communicate, publish, publicly |
138 | 8 | Steve Welburn | perform, publicly display and distribute such content. The rights you |
139 | 8 | Steve Welburn | grant in this license are for the limited purpose of operating, promoting, |
140 | 8 | Steve Welburn | and improving our Services, and to develop new ones. This license |
141 | 8 | Steve Welburn | continues even if you stop using our Services (for example, for a |
142 | 8 | Steve Welburn | business listing you have added to Google Maps). |
143 | 8 | Steve Welburn | </pre> |
144 | 8 | Steve Welburn | |
145 | 8 | Steve Welburn | h3. Microsoft Services Agreement |
146 | 8 | Steve Welburn | |
147 | 10 | Steve Welburn | 19 October 2012 Microsoft services agreement : http://tinyurl.com/8e4kucy |
148 | 8 | Steve Welburn | |
149 | 8 | Steve Welburn | <pre> |
150 | 8 | Steve Welburn | When you upload your content to the services, you agree that it may |
151 | 8 | Steve Welburn | be used, modified, adapted, saved, reproduced, distributed, and |
152 | 8 | Steve Welburn | displayed to the extent necessary to protect you and to provide, protect |
153 | 8 | Steve Welburn | and improve Microsoft products and services. For example, we may |
154 | 8 | Steve Welburn | occasionally use automated means to isolate information from email, |
155 | 8 | Steve Welburn | chats, or photos in order to help detect and protect against spam and |
156 | 8 | Steve Welburn | malware, or to improve the services with new features that makes them |
157 | 8 | Steve Welburn | easier to use. When processing your content, Microsoft takes steps to |
158 | 8 | Steve Welburn | help preserve your privacy. |
159 | 8 | Steve Welburn | </pre> |
160 | 8 | Steve Welburn | |
161 | 8 | Steve Welburn | h2. Archiving Data |
162 | 8 | Steve Welburn | |
163 | 8 | Steve Welburn | h3. BBC Domesday Project |
164 | 8 | Steve Welburn | |
165 | 8 | Steve Welburn | 1986 project to create a modern-day Domesday Book (early crowd-sourcing) |
166 | 8 | Steve Welburn | * Used “BBC Master” computers with data on laserdisc |
167 | 8 | Steve Welburn | * Collected 147,819 pages of text and 23,225 photos |
168 | 8 | Steve Welburn | * Degrading media and obsolete technology put the data at risk! |
169 | 8 | Steve Welburn | |
170 | 8 | Steve Welburn | Domesday Reloaded (2011) |
171 | 8 | Steve Welburn | * Required emulation of software |
172 | 8 | Steve Welburn | * Images restored from original masters |
173 | 8 | Steve Welburn | * http://www.bbc.co.uk/history/domesday |
174 | 8 | Steve Welburn | |
175 | 8 | Steve Welburn | To allow long-term access to data |
176 | 8 | Steve Welburn | * Don't use obscure formats! |
177 | 8 | Steve Welburn | * Don't use obscure media! |
178 | 8 | Steve Welburn | * Don't rely on technology being available! |
179 | 8 | Steve Welburn | * Do keep original source material! |
180 | 12 | Steve Welburn | |
181 | 15 | Steve Welburn | Google images for "BBC Domesday":https://www.google.co.uk/search?tbm=isch&q=bbc+domesday |
182 | 12 | Steve Welburn | |
183 | 27 | Steve Welburn | h2. Sharing Data |
184 | 27 | Steve Welburn | |
185 | 27 | Steve Welburn | "Sharing Detailed Research Data Is Associated with Increased Citation Rate":http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0000308 |
186 | 27 | Steve Welburn | |
187 | 27 | Steve Welburn | |
188 | 12 | Steve Welburn | h2. Related Media |
189 | 12 | Steve Welburn | |
190 | 12 | Steve Welburn | h3. Disk Drives Break |
191 | 12 | Steve Welburn | |
192 | 12 | Steve Welburn | "DataCent collection of disk drive failure sounds":http://datacent.com/hard_drive_sounds.php |
193 | 12 | Steve Welburn | |
194 | 12 | Steve Welburn | h3. Buildings burn down |
195 | 12 | Steve Welburn | |
196 | 12 | Steve Welburn | "Southampton University Mountbatten Building Fire":http://www.flickr.com/search/?q=Southampton%20University%20Mountbatten%20Building%20Fire |
197 | 12 | Steve Welburn | |
198 | 12 | Steve Welburn | h3. Laptops Break / Get Broken |
199 | 13 | Steve Welburn | |
200 | 13 | Steve Welburn | * "Shot laptop":http://lilysussman.wordpress.com/tag/laptop-destroyed/ |
201 | 22 | Steve Welburn | * "Google images of broken laptops":https://www.google.co.uk/search?q=broken%20laptop&um=1&tbm=isch |
202 | 1 | Steve Welburn | |
203 | 27 | Steve Welburn | h1. xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |
204 | 22 | Steve Welburn | |
205 | 22 | Steve Welburn | h3. Failure Trends In A Large Disk Drive Population |
206 | 22 | Steve Welburn | |
207 | 22 | Steve Welburn | The study found that ~13% of hard drives were replaced within 3 years, and ~20% within 4 years, because a repair was required! |
208 | 22 | Steve Welburn | |
209 | 22 | Steve Welburn | FAST '07 paper on "Failure Trends In A Large Disk Drive Population":https://www.usenix.org/conference/fast-07/failure-trends-large-disk-drive-population |
210 | 22 | Steve Welburn | |
211 | 22 | Steve Welburn | Google report on over 100,000 consumer-grade disk drives of 80-400 GB, produced in or after 2001 and used within Google. Data were collected from December 2005 to August 2006. Drives went through a burn-in process and only those commissioned for use were included in the study, so certain basic defects may well be excluded from the report. Also, the disks were largely used in servers, so they accumulated relatively many powered-on hours compared with drives in desktop / laptop computers. |
212 | 22 | Steve Welburn | |
213 | 22 | Steve Welburn | bq. the most accurate definition we can present of a failure event for our study is: a drive is considered to have failed if it was replaced as part of a repairs procedure. Note that this definition implicitly excludes drives that were replaced due to an upgrade. |
214 | 22 | Steve Welburn | |
215 | 22 | Steve Welburn | Annualised failure rate (AFR) by drive age: ~3% in the first 3 months, ~2% up to 1 year, ~8% at 2 years, ~9% at 3 years, ~6% at 4 years, ~7% at 5 years |
216 | 22 | Steve Welburn | |
217 | 22 | Steve Welburn | NB: Variation with model and manufacturer! |
218 | 22 | Steve Welburn | |
219 | 22 | Steve Welburn | In the first 6 months, the risk of failure is highest for low & high utilisation! |
220 | 22 | Steve Welburn | * ~10% for high utilisation in the first 3 months |
221 | 22 | Steve Welburn | * for 3-year old drives ~4-5% chance of failure whatever the utilisation |
222 | 22 | Steve Welburn | * failures are most likely at low drive temperatures (on start-up ?) i.e. < 25 deg. C |
223 | 22 | Steve Welburn | * drives over 2 years old are most likely to fail at high temperatures (could be mode of failure ?) |
224 | 22 | Steve Welburn | |
225 | 22 | Steve Welburn | Disks with SMART scan errors are 10 times more likely to fail - almost 30% of drives with a SMART scan error failed within 8 months of the error. |
226 | 22 | Steve Welburn | * If a drive up to 8 months old gets a scan error, there's a 90% chance of it surviving at least 8 months |
227 | 22 | Steve Welburn | * If a drive over 2 years old gets a scan error, there's a 60% chance of it surviving at least 8 months |
228 | 22 | Steve Welburn | * If you have more than 1 scan error on a drive, it's significantly less likely to survive |
229 | 22 | Steve Welburn | * Similar for SMART reallocation counts: the AFR is almost 20% if a reallocation occurs in the first 3 months |
230 | 22 | Steve Welburn | * ...but over 36% of failed drives had zero counts on all variables |
231 | 22 | Steve Welburn | |
232 | 28 | Steve Welburn | |
233 | 22 | Steve Welburn | bq. Talagala and Patterson [20] perform a detailed error analysis of 368 SCSI disk drives over an eighteen month period, reporting a failure rate of 1.9%. Results on a larger number of desktop-class ATA drives under deployment at the Internet Archive are presented by Schwarz et al [17]. They report on a 2% failure rate for a population of 2489 disks during 2005, while mentioning that replacement rates have been as high as 6% in the past. Gray and van Ingen [9] cite observed failure rates ranging from 3.3-6% in two large web properties with 22,400 and 15,805 disks respectively. A recent study by Schroeder and Gibson [16] helps shed light into the statistical properties of disk drive failures. The study uses failure data from several large scale deployments, including a large number of SATA drives. They report a significant overestimation of mean time to failure by manufacturers and a lack of infant mortality effects. None of these user studies have attempted to correlate failures with SMART parameters or other environmental factors. |
234 | 22 | Steve Welburn | |
235 | 22 | Steve Welburn | |
236 | 22 | Steve Welburn | Hard drive manufacturers often quote yearly failure rates below 2% [2] |
237 | 22 | Steve Welburn | User studies have seen rates as high as 6% [9] |
238 | 22 | Steve Welburn | |
239 | 22 | Steve Welburn | Between 15% and 60% of drives that users considered failed and returned to the manufacturer have no defect as far as the manufacturer is concerned [7]. |
240 | 22 | Steve Welburn | Between 20% and 30% “no problem found” cases were observed after analysing failed drives from a study of 3477 disks [11]. |
241 | 22 | Steve Welburn | |
242 | 22 | Steve Welburn | Failure rates are known to be highly correlated with drive models, manufacturers and vintages [18]. |
243 | 22 | Steve Welburn | |
244 | 22 | Steve Welburn | |
245 | 22 | Steve Welburn | |
246 | 22 | Steve Welburn | Gleditsch, N.P., C. Metelits and H. Strand. 2003. Posting your data: Will you be scooped or will you be famous? Int. Stud. Perspect. 4:89–97. |
249 | 22 | Steve Welburn | |
250 | 22 | Steve Welburn | Freckleton, R.P., P. Hulme, P. Giller and G. Kerby. 2005. The changing face of applied ecology. J. Appl. Ecol. 42:1–3. |
253 | 22 | Steve Welburn | |
254 | 22 | Steve Welburn | "Lessons from the JMCB Archive":http://muse.jhu.edu/journals/mcb/summary/v038/38.4mccullough.html |
257 | 28 | Steve Welburn | |
258 | 28 | Steve Welburn | |
259 | 28 | Steve Welburn | |
260 | 28 | Steve Welburn | h2. More To Read |
261 | 28 | Steve Welburn | |
262 | 28 | Steve Welburn | Schroeder, Bianca, and Garth A. Gibson. "Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?" Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST). 2007. |
263 | 28 | Steve Welburn | |
264 | 28 | Steve Welburn | Lancaster, Larry, and Alan Rowe. "Measuring Real World Data Availability.":http://static.usenix.org/publications/library/proceedings/lisa2001/tech/full_papers/lancaster/lancaster_html/ Proceedings of the LISA 2001 15th Systems Administration Conference. 2001. |