Steve Welburn, 2012-10-05 01:09 PM


Backing up

Why back up your data?

  • Hard disks die
  • Portable devices can be lost or broken
  • Disasters happen

Laptop Theft

http://www2.nbc4i.com/news/2011/jan/06/osu-phd-student-has-laptop-stolen-ar-352706/

Wednesday night, while at Brazenhead in Grandview for their $3 hamburger night, the most popular night of the week at the restaurant, her car was broken into and her chrome Mac book pro was stolen.

She has a back-up for all but the last six months of research, but the most important part of the research had happened recently.

How to back up data

The core principle is that backup copies of data should be stored in a different location to the main copy.

If you delete your local copy of the data, the original backup becomes the primary copy... is that copy backed up anywhere else?
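Before deleting a local copy, it is worth checking that the backup really holds everything. The following is a minimal sketch of one way to do that, assuming both copies are ordinary folders reachable from the same machine; the function names `sha256_of` and `missing_or_changed` are illustrative and not part of any existing tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def missing_or_changed(primary: Path, backup: Path) -> list[str]:
    """List files under `primary` that are absent from, or differ in, `backup`."""
    problems = []
    for source in sorted(primary.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(primary)
        copy = backup / relative
        # A file is a problem if the backup lacks it or its contents differ.
        if not copy.is_file() or sha256_of(copy) != sha256_of(source):
            problems.append(relative.as_posix())
    return problems
```

An empty list means every file in the primary folder has an identical counterpart in the backup folder. It says nothing about whether the backup itself is stored in a second location.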

Suitable locations for backups are:
  • A firesafe
  • A network copy
    • A network drive, e.g. provided by the institution
    • Internet storage (in the cloud)
    • A data repository - this could be a public thematic / institutional repository for publishing completed research datasets, or an internal repository for archiving datasets during research
  • A portable device / portable media which you keep somewhere other than under your desk

The best backup is the one you do!

Backing up to an external device means that you need access to the device... network drives and "internal" backups are usually more readily available. For example, back up every time you're in the office / lab or at home.
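A backup that takes one command is more likely to get done. As a minimal sketch, assuming the backup location (network drive, portable disc, etc.) is mounted as an ordinary folder, the function below copies a data folder into a new timestamped folder there. The name `snapshot` and its parameters are hypothetical, chosen only for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def snapshot(source: Path, backup_root: Path, now: Optional[datetime] = None) -> Path:
    """Copy `source` into a new timestamped folder under `backup_root`."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H%M%S")
    destination = backup_root / f"{source.name}_{stamp}"
    # copytree raises an error if the destination already exists,
    # so an earlier snapshot is never silently overwritten.
    shutil.copytree(source, destination)
    return destination
```

For example, `snapshot(Path("recordings"), Path("/Volumes/backup"))` would create a folder such as `/Volumes/backup/recordings_2012-10-05_130900`; both paths here are made up. Keeping full timestamped copies is simple but uses a lot of space, so old snapshots need pruning by hand.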

Can't I just put it in the cloud?

Google's terms:

Your Content in our Services

Some of our Services allow you to submit content. You retain ownership of any intellectual property rights that you hold in that content. In short, what belongs to you stays yours.

When you upload or otherwise submit content to our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content. The rights you grant in this license are for the limited purpose of operating, promoting, and improving our Services, and to develop new ones. This license continues even if you stop using our Services (for example, for a business listing you have added to Google Maps). Some Services may offer you ways to access and remove content that has been provided to that Service. Also, in some of our Services, there are terms or settings that narrow the scope of our use of the content submitted in those Services. Make sure you have the necessary rights to grant us this license for any content that you submit to our Services.

So be careful. I won't be putting important or personal stuff on Google with those terms!

Microsoft SkyDrive from the Windows Live services agreement:

3.3. What does Microsoft do with my content? When you upload your content to the services, you agree that it may be used, modified, adapted, saved, reproduced, distributed, and displayed to the extent necessary to protect you and to provide, protect and improve Microsoft products and services. For example, we may occasionally use automated means to isolate information from email, chats, or photos in order to help detect and protect against spam and malware, or to improve the services with new features that makes them easier to use. When processing your content, Microsoft takes steps to help preserve your privacy.

So pretty similar. Another one to be careful about!

Apple's iCloud is better as they restrict publication rights to data which you want to make public / share:

Except for material we may license to you, Apple does not claim ownership of the materials and/or Content you submit or make available on the Service. However, by submitting or posting such Content on areas of the Service that are accessible by the public or other users with whom you consent to share such Content, you grant Apple a worldwide, royalty-free, non-exclusive license to use, distribute, reproduce, modify, adapt, publish, translate, publicly perform and publicly display such Content on the Service solely for the purpose for which such Content was submitted or made available, without any compensation or obligation to you.

Dropbox is relatively good - probably because they just provide storage and aren't mining it to use in all their other services!

Even so, there are issues. Data stored in the cloud is still stored somewhere... you just don't have control over where that location is. Your data may be stored in a country which gives the government the right to access data. Also, the firm that stores your data may still be required to comply with the laws of its home country when the data is stored elsewhere. It is, however, unlikely that Digital Audio research data will be sensitive enough for this to be an issue.

Forbes article on "Can European Firms Legally Use US Clouds To Store Data": http://www.forbes.com/sites/ciocentral/2012/01/02/can-european-firms-legally-use-u-s-clouds-to-store-data/

Both Amazon Web Services and Microsoft have recently acknowledged that they would comply with U.S. government requests to release data stored in their European clouds, even though those clouds are located outside of direct U.S. jurisdiction and would conflict with European laws.

If you do store data in the cloud, consider encrypting it - e.g. using an encrypted .dmg file on a Mac, or using TrueCrypt for a cross-platform solution. These create an encrypted "disc" in a file which you can mount and treat like a real disc - but all the content is encrypted. Alternatively, BoxCryptor will encrypt all the individual files in a folder - with the standard version the file names remain visible, whereas the (non-Free) "Unlimited" version encrypts the file names as well.

SpiderOak provide "zero knowledge" privacy in which all data is encrypted locally before being submitted to the cloud, and SpiderOak do not have a copy of your decryption key - i.e. they can't actually examine your data.