Why do Data Management?

Evidence Promoting Good Data Management

Data Reuse

Do you reuse other people's data? Can they reuse yours?

Whose data is it anyway?

QMUL HR Contract Terms and Conditions:

16. Patents & Copyright
a) Any discovery, design, computer software program or other work or invention which might reasonably be exploitable (‘Invention’) which is discovered, invented or created by the Employee (either alone or with any other person) either directly or indirectly in the course of their normal duties or in the course of duties specifically assigned to him in the course of his employment shall promptly be disclosed in writing to the College. All intellectual property rights in such Invention shall be the absolute property of the College and the College shall have the right to apply for, prosecute and obtain patent or other similar protection in its own name. Intellectual property rights include all patent rights, copyright and rights in respect of confidential information and know-how. The ownership of copyright in research papers, review articles and books will normally be waived by the College in favour of the author unless subject to any conditions placed on the works by the funder.

The important bit being...

Any ... work ... which might reasonably be exploitable ... which is ... created by the Employee ... in the course of duties ... in the course of his employment ... shall be the absolute property of the College

In the research contract, there is another clause:

The Employee will be expected to publish the results of his/her research work, subject to the conditions of any contract providing funding for the research

Therefore, if funding bodies make funding contingent on publishing data as part of the research results, data publication will be allowed.

Research policies at QMUL Academic Registry and Council Secretariat

Creative Commons: http://wiki.creativecommons.org/Data CC Licenses / CC0

Science Commons: http://sciencecommons.org/projects/publishing/open-access-data-protocol/

Restrictions based on data ownership

Restrictions based on data parentage - e.g. data derived from CC-SA data must be shared under the same terms

Article on CC-BY and data

Where possible, CC0 with a request for citations is preferred (Why does Dryad use CC0)

If data is based on copyrighted works, it may be appropriate to restrict the license to allow only research / non-commercial use (e.g. this would prevent chord annotations being published commercially).

Practical Steps Towards Data Management

Even if you don't have a readily available data repository, or your data can't be published, there are still steps you can take to manage it.

File formats - use open formats where possible to future-proof files.

File naming - give files meaningful names.

Metadata - include a plain-text README file describing the contents of the files.

License - include a plain-text LICENSE file describing the license for the dataset.

Check that a copy of your data will be backed up - e.g. check that the network drive you store your data on is actually backed up.

If recovering your data really matters to you, make sure it is also backed up off-site!

This could be (i) in the cloud (e.g. Dropbox); (ii) on a USB drive (hard disk or flash); (iii) at a specific network location (e.g. a NAS box at home).
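One way to check that a backup really holds a usable copy of your data is to compare checksums of the original files against the backed-up files. The following is a minimal sketch under stated assumptions, not a prescribed procedure: the function names (sha256_of, build_manifest, compare_manifests) are hypothetical, and it assumes both copies are reachable as ordinary directories and that Python 3.9 or later is available.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root, with forward slashes) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(original: dict[str, str], backup: dict[str, str]) -> list[str]:
    """List files from the original that are missing or different in the backup."""
    problems = []
    for name, checksum in original.items():
        if name not in backup:
            problems.append(f"missing: {name}")
        elif backup[name] != checksum:
            problems.append(f"changed: {name}")
    return problems
```

To use it, build one manifest for the working copy and one for the backup location (both paths are placeholders you would supply), then pass them to compare_manifests. An empty list means every original file is present in the backup with identical contents.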


The appropriate repository will partly depend upon the data.

It could be... C4DM RDR, Dryad, Flickr, figshare, Archive.org...

However, if you want the data to be reused in a citable manner, remember to package the license and the required citation with the data. Then, however the data reaches the final user, the only excuse for not citing it is that someone has deliberately removed the information.
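Packaging the license and citation with the data can be automated so that an archive is never produced without them. The sketch below is one possible approach, not a required workflow: the file names README.txt, LICENSE.txt and CITATION.txt and the function names are assumptions made for illustration, and a zip archive is used only because it is a widely readable open format.

```python
import zipfile
from pathlib import Path

# Hypothetical file names; use whatever names your dataset actually follows.
REQUIRED_FILES = ("README.txt", "LICENSE.txt", "CITATION.txt")


def missing_metadata(dataset_dir: Path) -> list[str]:
    """Return the required metadata files that are absent from the dataset directory."""
    return [name for name in REQUIRED_FILES if not (dataset_dir / name).is_file()]


def package_dataset(dataset_dir: Path, archive_path: Path) -> Path:
    """Zip the dataset, refusing to do so if the license or citation is missing."""
    dataset_dir = dataset_dir.resolve()
    archive_path = archive_path.resolve()
    missing = missing_metadata(dataset_dir)
    if missing:
        raise FileNotFoundError(f"dataset is missing: {', '.join(missing)}")
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(dataset_dir.rglob("*")):
            # Skip directories and the archive itself if it lives inside the dataset.
            if not path.is_file() or path == archive_path:
                continue
            # Keep the dataset folder name as the top-level folder in the archive.
            arcname = f"{dataset_dir.name}/{path.relative_to(dataset_dir).as_posix()}"
            archive.write(path, arcname)
    return archive_path
```

Because the metadata files travel inside the same archive as the data, anyone who receives the archive, by whatever route, also receives the license and the citation request.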

