Steve Welburn, 2012-11-12 11:01 PM
h1. Evidence Promoting Good Data Management
If you have any additional examples that you would like to share, please email them to: rdm.c4dm at gmail.com
L'Aquila earthquake, Italy
* "Valuable Cancer Research Lost In Italian Earthquake":http://www.medindia.net/news/Valuable-Cancer-Research-Lost-In-Italian-Earthquake-49843-1.htm 12 April 2009 (medindia.net)
bq. A major casualty of the last week’s earthquake in Italy could be valuable research work done by a UK-based charity over the last two years.
bq. Leukaemia Busters, Southampton, has been developing pioneering drugs in a clinic in the quake-hit city of L'Aquila.
bq. Dr David Flavell, from the charity, said it was likely specially engineered leukaemia cells used to produce anti-bodies had been lost.
* "Leukaemia Busters' research survives Italian earthquake":http://www.dailyecho.co.uk/news/4283096.Leukaemia_research_survives_earthquake/ 10 April 2009 (Southern Daily Echo)
bq. Two years of life-saving research into the treatment of a killer disease feared lost forever by a Hampshire charity has incredibly survived the Italian earthquake disaster.
bq. Leukaemia Busters were delighted to discover that laboratories where scientists had spent the past two-and-a-half years working to develop pioneering drugs to fight leukaemia remain standing.
bq. The unbelievable news came after rescue workers allowed Professor Rodolfo Ippoliti into the devastated city of L’Aquila and see for himself the destruction caused by the 6.3 magnitude quake.
Tohoku earthquake, Japan 2011
* "Assistance for the researchers affected by the Tohoku Earthquake, A message from the Japanese scientist community at NIH":http://www.jsdb.jp/news/etc/etc165e.htm 2 April 2011
bq. We have heard that research facilities and equipment at many universities and research institutions in the Tohoku and Kanto regions were damaged as a result of this disaster, and many scientists and students have been forced to stop their research because their valuable research samples or data have been lost. All of the staff and the researchers at NIH are deeply distressed by the devastation that has struck Japan.
Southampton University Mountbatten building
* "Fire destroys top research centre":http://news.bbc.co.uk/1/hi/england/hampshire/4390048.stm (BBC)
* "Images on Flickr":http://www.flickr.com/search/?q=Southampton%20University%20Mountbatten%20Building%20Fire
* "University vows to rebuild centre":http://news.bbc.co.uk/1/hi/england/hampshire/4394294.stm (BBC)
* "Fire at University of Southampton data recovery":http://www.computerweekly.com/photostory/2240109845/Photos-Fire-at-University-of-Southampton-data-recovery/4/Mountbatten-building-University-of-Southampton-fire-data-recovery 1 March 2011 (Computer Weekly)
U. of York Chemistry - 1980
* "Aftermath of the Chemistry Department fire, May 1980":https://dlib.york.ac.uk/yodl/app/image/detail?id=york%3A13291&ref=browse (U. of York Digital Library)
U. of York History - 1992
* "Fire damage to University of York History Department, Vanbrugh College 1992":https://dlib.york.ac.uk/yodl/app/image/detail?id=york%3a15254&ref=search (U. of York Digital Library)
* "Fire damaged corridor, University of York History Department, Vanbrugh College 1992":https://dlib.york.ac.uk/yodl/app/image/detail?id=york%3a15255&ref=search (U. of York Digital Library)
U. of York, fire in student room - 1993
* "Student room after fire - Goodricke College Block C, 1993":https://dlib.york.ac.uk/yodl/app/image/detail?id=york%3a14934&ref=search (U. of York Digital Library)
U. of York chemistry building
* "University of York chemistry department fire":http://www.bbc.co.uk/news/uk-england-york-north-yorkshire-16857952 2 February 2012 (BBC)
* "Fire at University of York's chemistry department":http://www.yorkpress.co.uk/news/9509335.Fire_at_University_of_York/ 3 February 2012 (York Press)
Strathclyde University engineering department
* "Firefighters tackle blaze at Strathclyde University":http://www.bbc.co.uk/news/uk-scotland-glasgow-west-16938271 7 February 2012 (BBC)
* "Further disruption for Strathclyde teaching students":http://www.journal-online.co.uk/article/9120-further-disruption-for-strathclyde-teaching-students 12 September 2012 (The Journal)
bq. The disruption began on 7 February when 150 students had to be evacuated as a fire started in the Roche Lab in the university's chemical engineering department, forcing the university to relocate lectures across the campus including the Royal College, and Students' Association building on John Street.
University of Glasgow
* "University counts cost of fire damage":http://news.bbc.co.uk/1/hi/scotland/1617643.stm 24 October 2001 (BBC)
bq. Professor Sir Graeme Davies said that a substantial amount of research had been lost in the fire.
Hurricane Katrina, New Orleans 2005
* "Riding Out the Storm":http://www.sciencemag.org/content/309/5741/1657.full 9 September 2005 (Science)
* "Displaced Researchers Scramble to Keep Their Science Going":http://www.sciencemag.org/content/309/5743/1980.full 23 September 2005 (Science)
* "New Orleans Labs Start Their Uncertain Comeback":http://www.sciencemag.org/content/310/5752/1267.full 25 November 2005 (Science)
* "One Year After, New Orleans Researchers Struggle to Rebuild":http://www.sciencemag.org/content/313/5790/1038.full 25 August 2006 (Science)
Hurricane Sandy, New York 2012
* "Sandy destroyed years of medical research":http://rt.com/usa/news/sandy-research-power-medicine-681/ 31 October 2012 (RT)
bq. When Hurricane Sandy struck New York, it washed away years of scientific research from the New York University School of Medicine, including genetically modified mice, enzymes, antibodies and DNA strands.
* "NYC Science Stunned by Sandy":http://www.the-scientist.com/?articles.view/articleNo/33109/title/NYC-Science-Stunned-by-Sandy/ 2 November 2012 (The Scientist)
bq. Flooding and blackouts caused by super storm Sandy have had a devastating impact on scores of scientists in the Big Apple, with one research center losing thousands of lab mice as well as precious reagents—a situation that could set some researchers back years.
* "Help for Sandy-Stricken Scientists":http://www.the-scientist.com/?articles.view/articleNo/33223/title/Help-for-Sandy-Stricken-Scientists/ 9 November 2012 (The Scientist)
* "New York research facilities feel Sandy's wrath":http://blogs.nature.com/news/2012/10/new-york-research-facilities-feel-sandys-wrath.html 1 November 2012 (Nature blog)
bq. Although New York University (NYU) was clearly the research facility hardest hit by this week’s storm, others were also affected. Leslie Vosshall, who studies the olfactory system of mosquitoes at Rockefeller University, located about 35 blocks further up river from NYU, shut down a computer server in the basement on Sunday, but fears it could have been damaged from flooding. She has had to wait for the university to pump out the water, before she can check on it. “We do have some of the data backed up elsewhere, but it would set us back significantly.”
* "Sandy’s Toll on Medical Research":http://www.slate.com/articles/health_and_science/science/2012/11/animals_drowned_in_sandy_nyu_medical_research_is_set_back_years_by_dead.html 31 October 2012 (Slate)
bq. In 2001, a tropical storm called Allison flooded Houston with several feet of rain and pushed 10 million gallons of water into the medical-school basements at the University of Texas. The disaster drowned at least 4,000 rats and mice, along with 78 monkeys, 35 dogs, and 300 rabbits. (More than half the animals on campus had been living underground.) Nearby, at the Baylor College of Medicine, basement flooding killed 30,000 mice.
Tropical Storm Allison, Houston 2001
* "Texas researchers regroup after Tropical Storm Allison":http://www.ama-assn.org/amednews/2001/08/13/hlsa0813.htm 13 August 2001 (amednews.com)
bq. Soaked hard drives and drowned lab animals may delay new medical discoveries by months or years, but hope survives as research facilities dry out.
bq. Tropical Storm Allison's flood caused the following losses at Baylor College of Medicine and the University of Texas-Houston Medical School:
* Several hundred rabbits
* More than 30,000 transgenic mice and rats
* A state-of-the-art MRI machine worth $2 million
* Ten years' worth of data on spinal cord injuries
* A 20-year collection of 60,000 breast tumor samples
* "Drowned rats":http://www.newscientist.com/article/dn864-drowned-rats.html 12 June 2001 (New Scientist)
bq. As well as destroying research animals, the floodwater has swamped computers. It has also caused power failures, knocking out the refrigerators and freezers used to store samples for research. Back-up cell cultures used for research into cancer at the Baylor College of Medicine will have died, say local officials.
University of Hawaii flood, 2004
* "Flood decimates building, work at University of Hawaii":http://usatoday30.usatoday.com/weather/news/2004-11-01-hawaii-flood_x.htm 1 November 2004 (USA Today)
bq. HONOLULU — Heavy rain sent water as much as 8 feet deep rushing through the University of Hawaii's main research library Saturday, destroying irreplaceable documents and books, toppling doors and walls and forcing a few students to break a window to escape.
bq. Lyttle's genetic research on the Drosophila goes back 35 years and some of it is irretrievably lost, he said.
bq. McBride and much of the library staff worked all day Sunday to try to save some of the 90,000 photographs stored in the basement along with rare government documents and Hawaiian maps.
bq. The flood also destroyed computers, books, magazines and equipment.
* "Classes canceled at UH on Wednesday":http://www.kpua.net/news.php?id=3652 2 November 2004 (kpua.net)
bq. ...But researchers at the University of Hawaii, which was hard hit, say the flash flood caused untold losses of research damage in computers damaged by flood waters.
h2. Anecdotal Tales Of Lost Data
h3. Recovery of Overwritten Hard Disk Data
5 October 2005 Linux Forums - http://tinyurl.com/8t7uaop
Hi, a friend of mine just overwrote two months of her
PhD thesis with an older version. I know recovery of
overwritten data is possible, but wonder if I'd need
special hardware to do it. Does anyone know something
about this ?
h3. Stolen laptop had PhD research
19 March 2008 Surrey Leader - http://tinyurl.com/9hmtlv4
Thirty-five minutes spent in Langley’s Willowbrook
Shopping Centre cost a Surrey woman much more than
she had anticipated.
Langley RCMP say that while she was shopping from
1-1:35 p.m. last Monday, someone broke into her
vehicle and stole a number of items, including
a Mac iBook laptop containing the research she had
compiled as she worked towards her PhD.
“All that information was on that computer and she
has no back-up file,” said Langley RCMP spokesman
Cpl. Brenda Marshall.
Google images of "Langley Willowbrook":https://www.google.co.uk/search?num=50&hl=en&q=Langley+Willowbrook+Shopping+Centre&&tbm=isch
h3. Happiness is the return of a stolen computer, with data intact
27 May 2010 The Press, NZ - http://tinyurl.com/38sznnh
Never has a man been so happy to see a computer full of data
Claudio De Sassi's world fell apart when a car containing almost three
years work towards his PhD was stolen two weeks ago.
De Sassi, a Canterbury University academic, could not hide his joy
yesterday as police reunited him with his stolen laptop and backpack.
h3. Thugs steal Christmas, doctoral dreams
22 December 2010 KRQE - http://tinyurl.com/9a5j56f
A tiny television sits where a big screen used to, and a Christmas tree
stands with little underneath it...
Even worse than the gifts, the crooks stole a MacBook Pro laptop and a
LaCie hard drive.
The hard drive had … her dissertation and nearly seven years of
research for her doctoral degree she was set to finish in a few weeks.
Osuna had everything backed up on a separate hard drive in a safe, but
burglars made off with that too.
"All I could think about is that all that time is gone, all that effort,
everything is gone," Osuna said.
h3. Laptop Stolen From OSU Doctoral Student
6 January 2011 NBC4i - http://tinyurl.com/bmybv9x
...her car was broken into and her chrome Mac book pro was stolen.
She has a back-up for all but the last six months of research, but the
most important part of the research had happened recently.
h3. Lost Thesis Poster
PostgraduateForum.com > Current PhD Students, PhD Life. 29 September 2011 - http://tinyurl.com/ct5e2no
I've 'lost' my thesis
Yes, I 'lost' my thesis today, at around 12:42pm (thesis RIP), microsoft word couldn't
cope with the size of the document and my file got corrupted. I'd removed a small chunk
of it and did some formatting to decrease its size yesterday but that obviously didn't
stop it happening. After a few hours trying to recover it, I gave in and called for
help. I then found out that, even if I'd managed to recover it, it probably wouldn't
be the whole document, there could be parts missing, formatting gone awol, etc No sweat
though, I regularly back up my work so it's just today's work that's been lost, well
morning and lunch really as I spent the afternoon attempting to savage it,-) bit
stressful but hey ho, not the end of the world. So for those of you who don't back your
work up, start doing it now! And regularly! I can't possibly imagine what would have
happened to me if I'd really lost everything weeks before submission...
h2. The Lost Laptop Problem
* 2010 Ponemon Institute report for Intel re. US laptops
** On average, 2.3% of laptops assigned to employees are lost each year
** In education & research that rises to 3.7%, with 10.8% of laptops being lost before the end of their useful life
*** Useful life is ~3 years, i.e. roughly the length of one PhD from allocation!
** 75% lost outside the workplace
* Very similar results from 2011 European report!
Intel 2010, The Billion Dollar Lost Laptop Problem - http://tinyurl.com/8c9m4bn
Intel 2011, The Billion Euro Laptop Problem - http://tinyurl.com/9wpbxn9
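The annual and lifetime figures above can be related with a little arithmetic. The sketch below is illustrative only: it assumes a constant, independent loss rate each year and a 3-year useful life, and the function name @cumulative_loss@ is ours, not something taken from the Ponemon report.

```python
def cumulative_loss(annual_rate: float, years: int) -> float:
    """Probability that a laptop is lost at least once within `years`.

    Assumes the same independent loss rate applies every year, which is
    a simplification and not a claim about the report's methodology.
    """
    return 1.0 - (1.0 - annual_rate) ** years


# Annual loss rates quoted in the list above.
rates = {"all sectors": 0.023, "education & research": 0.037}

for sector, rate in rates.items():
    print(f"{sector}: {cumulative_loss(rate, 3):.1%} lost within 3 years")
```

Under these assumptions the education & research figure comes out at about 10.7%, close to the 10.8% quoted in the report.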
h2. Laptop Reliability
* 2011 PC World Laptop Reliability Survey from 63,000 readers:
** 22.6% had significant problems during the product's lifetime
** Of which...
*** 19% had OS problems (~1 in 25 of all laptops)
*** 18% had HDD problems (~1 in 25 of all laptops)
*** 10% had PSU problems (~1 in 50 of all laptops)
PC World 2011 - http://tinyurl.com/876qza5
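The "1 in N" figures above follow from multiplying the overall problem rate by the share of each fault type. A minimal sketch of that arithmetic, using only the percentages quoted above (the variable and function names are ours, for illustration):

```python
SIGNIFICANT_PROBLEM_RATE = 0.226  # 22.6% of laptops had significant problems

# Share of the *problem* laptops reporting each fault type.
FAULT_SHARES = {"OS": 0.19, "HDD": 0.18, "PSU": 0.10}


def share_of_all_laptops(fault_share: float) -> float:
    """Convert a share of problem laptops into a share of all laptops."""
    return SIGNIFICANT_PROBLEM_RATE * fault_share


for fault, share in FAULT_SHARES.items():
    overall = share_of_all_laptops(share)
    print(f"{fault}: {overall:.1%} of all laptops, about 1 in {round(1 / overall)}")
```

This gives roughly 1 in 23 (OS), 1 in 25 (HDD) and 1 in 44 (PSU), so the figures in the list are rounded approximations.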
h2. Hard Disk Failures
* Failure Trends In A Large Disk Drive Population
** USENIX Conference on File and Storage Technologies 2007 (FAST '07)
** Eduardo Pinheiro, Wolf-Dietrich Weber & Luiz André Barroso, Google Inc.
* Data collected from over 100,000 disk drives at Google
* As part of repair procedures:
** ~13% of disk drives replaced over 3 years
** ~20% of disk drives replaced over 4 years
[[Failure Trends In A Large Disk Drive Population|More info]]
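To compare the cumulative replacement figures above with per-year risk, they can be converted into an equivalent constant annual rate. This is a rough sketch: the constant-rate assumption and the function name @equivalent_annual_rate@ are ours, and the result is not a figure reported in the paper.

```python
def equivalent_annual_rate(cumulative: float, years: float) -> float:
    """Constant yearly replacement rate that would produce the given
    cumulative replacement fraction after `years` (simplifying assumption)."""
    return 1.0 - (1.0 - cumulative) ** (1.0 / years)


# Cumulative replacement fractions quoted in the list above.
for cumulative, years in [(0.13, 3), (0.20, 4)]:
    rate = equivalent_annual_rate(cumulative, years)
    print(f"~{cumulative:.0%} over {years} years is about {rate:.1%} per year")
```

On this simplified view, roughly 1 drive in 20 is replaced each year, which is a useful rule of thumb when deciding how many copies of research data to keep.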
h2. Data management in the cloud
See JISC/DCC document "Curation In The Cloud" - http://tinyurl.com/8nogtmv
Service agreements may give wide-ranging rights to the data service.
h3. Google Terms Of Service
1 March 2012 Google Terms of Service : http://tinyurl.com/89dc9fa
When you upload or otherwise submit content to our Services, you give
Google (and those we work with) a worldwide license to use, host, store,
reproduce, modify, create derivative works (such as those resulting from
translations, adaptations or other changes we make so that your
content works better with our Services), communicate, publish, publicly
perform, publicly display and distribute such content. The rights you
grant in this license are for the limited purpose of operating, promoting,
and improving our Services, and to develop new ones. This license
continues even if you stop using our Services (for example, for a
business listing you have added to Google Maps).
h3. Microsoft Services Agreement
19 October 2012 Microsoft services agreement : http://tinyurl.com/8e4kucy
When you upload your content to the services, you agree that it may
be used, modified, adapted, saved, reproduced, distributed, and
displayed to the extent necessary to protect you and to provide, protect
and improve Microsoft products and services. For example, we may
occasionally use automated means to isolate information from email,
chats, or photos in order to help detect and protect against spam and
malware, or to improve the services with new features that makes them
easier to use. When processing your content, Microsoft takes steps to
help preserve your privacy.
h2. Archiving Data
h3. BBC Domesday Project
A 1986 project to create a modern-day Domesday Book (an early example of crowd-sourcing)
* Used “BBC Master” computers with data on laserdisc
* Collected 147,819 pages of text and 23,225 photos
* Expiring media and obsolete technology put the data at risk!
Domesday Reloaded (2011)
* Required emulation of software
* Images restored from original masters
To allow long-term access to data
* Don't use obscure formats!
* Don't use obscure media!
* Don't rely on technology being available!
* Do keep original source material!
Google images for "BBC Domesday":https://www.google.co.uk/search?tbm=isch&q=bbc+domesday
h2. Sharing Data
Piwowar, Heather A., Roger S. Day, and Douglas B. Fridsma. "Sharing detailed research data is associated with increased citation rate.":http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0000308
PLoS One 2.3 (2007): e308.
h2. Related Media
h3. Disk Drives Break
"DataCent collection of disk drive failure sounds":http://datacent.com/hard_drive_sounds.php
h3. Laptops Break / Get Broken
* "Shot laptop":http://lilysussman.wordpress.com/tag/laptop-destroyed/
* "Google images of broken laptops":https://www.google.co.uk/search?q=broken%20laptop&um=1&tbm=isch
h2. More To Read
Albers, S. "Editorial: Well Documented Articles Achieve More Impact":http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1568022
BuR Business Research Journal, Vol. 2, No.2, May 2009
Anderson, Richard G., et al. "The role of data/code archives in the future of economic research.":http://www.tandfonline.com/doi/abs/10.1080/13501780801915574
Journal of Economic Methodology 15.1 (2008): 99-119.
Borgman, Christine L. "The conundrum of sharing research data."
Journal of the American Society for Information Science and Technology 63.6 (2012): 1059-1078.
Campbell, Eric G., et al. "Data withholding in academic genetics."
JAMA: the journal of the American Medical Association 287.4 (2002): 473-480.
Evanschitzky, Heiner, et al. "Replication research's disturbing trend.":http://www.sciencedirect.com/science/article/pii/S0148296306002347
Journal of Business Research 60.4 (2007): 411-415.
Fischer, Beth A., and Michael J. Zigmond. "The essential nature of sharing in science."
Science and engineering ethics 16.4 (2010): 783-799.
Freckleton, R.P., P. Hulme, P. Giller, and G. Kerby. "The changing face of applied ecology.":http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2664.2005.00969.x/full
Journal of Applied Ecology 42 (2005): 1-3.
Gleditsch, N.P., C. Metelits, and H. Strand. "Posting your data: Will you be scooped or will you be famous?"
International Studies Perspectives 4 (2003): 89-97.
Lancaster, Larry, and Alan Rowe. "Measuring Real World Data Availability.":http://static.usenix.org/publications/library/proceedings/lisa2001/tech/full_papers/lancaster/lancaster_html/
Proceedings of the LISA 2001 15th Systems Administration Conference. 2001.
McCullough, Bruce D., Kerry Anne McGeary, and Teresa D. Harrison. "Lessons from the JMCB Archive.":http://muse.jhu.edu/journals/mcb/summary/v038/38.4mccullough.html
Journal of Money, Credit, and Banking 38.4 (2006): 1093-1107.
Piwowar, Heather A., and Wendy W. Chapman. "Public sharing of research datasets: a pilot study of associations."
Journal of informetrics 4.2 (2010): 148-156.
Piwowar, Heather A., et al. "Towards a data sharing culture: recommendations for leadership from academic health centers."
PLoS medicine 5.9 (2008): e183.
Schroeder, Bianca, and Garth A. Gibson. "Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?":http://www.usenix.org/event/fast07/tech/schroeder/schroeder.pdf
Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST). 2007.
Vandewalle, Patrick, Jelena Kovacevic, and Martin Vetterli. "Reproducible research in signal processing."
Signal Processing Magazine, IEEE 26.3 (2009): 37-47.
Whitlock, Michael C. "Data archiving in ecology and evolution: best practices."
Trends in ecology & evolution 26.2 (2011): 61-65.
Whitlock, Michael C., et al. "Data archiving."
The American Naturalist 175.2 (2010): 145-146.
Wicherts, Jelte M., Marjan Bakker, and Dylan Molenaar. "Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results."
PLoS One 6.11 (2011): e26828.