Steve Welburn, 2012-11-16 04:33 PM

h1. Publishing research data

Research data publication allows your data to be reused by other researchers, e.g. to validate your research or to carry out follow-on research. To that end, a suitable data publication host will make your data discoverable (e.g. by publishing metadata) and publicly accessible (i.e. on the internet).

Research data can be published on the internet through:
* project web sites
* research group web sites
* generic web archives (e.g. "archive.org":http://archive.org)
* research data sites (e.g. "figshare":http://figshare.com/)
* more general open access research hosts (e.g. "f1000 Research":http://f1000research.com/about/)
* thematic repositories dedicated to a specific discipline or subject area (sadly, there is no sign of an appropriate repository for digital music and audio research)
* institutional repositories dedicated to research from a specific organisation (e.g. QMUL have "a repository":https://qmro.qmul.ac.uk/jspui/ through which "Green open access":http://en.wikipedia.org/wiki/Open_access copies of papers by QM research staff can be published)
* supplementary materials [[Publishing Data Through Journals|attached to journal articles]]

Within the Centre for Digital Music, we now have a "research data repository":http://c4dm.eecs.qmul.ac.uk/rdr/ for publishing research data outputs from the group.

If the publication web site is also to be the long-term archive for your data, you should check that it meets the criteria for an archival storage system. However, although data will be written to the host infrequently, published data is expected to be accessed more often than archived data. Offline storage is therefore not suitable.

If an external publisher is used for your research data, you should check the T&Cs, e.g. to see whether copyright on the data is transferred to the publisher and how long they will publish your data for.

If data is published through a publisher / repository, then it can also be held on institutional storage so long as the license is followed. Publishing under a Creative Commons license makes this easy. However, if data is available in multiple places, different versions of the data may arise (e.g. through changes between upload dates, or data corruption). You should therefore make it easy to identify which specific version of the data is correct by publishing a "digital fingerprint":http://en.wikipedia.org/wiki/Cryptographic_hash_function (e.g. an "MD5 hash":http://en.wikipedia.org/wiki/MD5). MD5 fingerprints can be generated in Windows using "MD5summer":http://www.md5summer.org/, in Linux with the GNU "md5sum":http://www.gnu.org/software/coreutils/manual/html_node/md5sum-invocation.html utility and on Mac OS X using "md5 or openssl":http://osxdaily.com/2009/10/13/check-md5-hash-on-your-mac/ .
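
Where none of these tools is to hand, a fingerprint can also be computed with a few lines of Python. The following is a minimal sketch using only the standard @hashlib@ module; the function names @file_fingerprint@ and @matches_published@ are illustrative assumptions, not part of any tool mentioned above.

```python
import hashlib


def file_fingerprint(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so that large datasets do not need to
    fit in memory. `algorithm` is any name accepted by hashlib.new().
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex, algorithm="md5"):
    """Compare a local copy against a fingerprint published with the data.

    Hypothetical helper: whitespace and letter case in the published
    value are ignored before comparing.
    """
    return file_fingerprint(path, algorithm) == published_hex.strip().lower()
```

For example, @file_fingerprint("dataset.zip")@ (a hypothetical file name) gives the value to publish alongside the data, and anyone holding a copy can call @matches_published@ with that value to confirm their copy is identical. MD5 is adequate for spotting accidental changes or corruption; passing @algorithm="sha256"@ gives a fingerprint that is also harder to forge deliberately.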

h2. Persistent IDs for data

In order to ensure ongoing access to your data, you should look to acquire a persistent ID for your dataset. However, persistence is a continuum, with some IDs more persistent than others. DOIs and handles are designed to be persistent in the long term, allowing a unique identifier to be redirected to the current location of your dataset: if the dataset moves, the DOI/handle can be pointed at the new location. Repositories and research data sites may provide DOIs for data submitted to them. Institutional URLs may be persistent if the institution makes a policy decision to make them so. Other URLs may change when web sites are revamped, making the published URL for your data return a "404 Not Found" message.

Persistent IDs are useful for referencing datasets, and are particularly handy if they are short: long / ugly DOIs can be shortened using the "ShortDOI":http://shortdoi.org service.
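
When citing a dataset, the DOI is normally given as a link through the public resolver at @https://doi.org/@, which redirects to wherever the dataset currently lives. The sketch below shows one way to build that link; the helper name @doi_to_url@ is an assumption for illustration, and @10.1000/182@ is used purely as an example identifier.

```python
from urllib.parse import quote

# Public DOI resolver; it redirects to the current location of the item.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Turn a DOI (with or without a common prefix) into a resolver URL."""
    doi = doi.strip()
    # Accept a few common ways of writing a DOI and reduce them to the bare ID.
    for prefix in ("doi:", "https://doi.org/", "http://dx.doi.org/"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Percent-encode characters that are not safe in a URL path.
    return DOI_RESOLVER + quote(doi, safe="/")


print(doi_to_url("doi:10.1000/182"))
```

Because only the resolver's record changes when the dataset moves, the link produced here can stay in papers and citations unchanged.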

h2. Available Repositories

The "Digital Curation Centre (DCC)":http://www.dcc.ac.uk/ have a (very short) "list of repositories":http://www.dcc.ac.uk/resources/external/repositories .

Repositories using DSpace can be registered on the DSpace web site, for inclusion in the list of "Who's using DSpace?":http://www.dspace.org/whos-using-dspace .

Within the University of London, the "School of Advanced Study":http://sas.ac.uk/ has a "repository":http://sas-space.sas.ac.uk/ of humanities-related items.

"University of the Arts London":http://arts.ac.uk/ have an online "repository":http://ualresearchonline.arts.ac.uk/ .

"EDINA":http://edina.ac.uk/ provides a national data centre:

bq. EDINA is a UK national academic data centre, designated by JISC on behalf of UK funding bodies to support the activity of universities, colleges and research institutes in the UK, by delivering access to a range of online data services through a UK academic infrastructure, as well as supporting knowledge exchange and ICT capacity building, nationally and internationally.

Services hosted at EDINA include:
* JISC "Mediahub":http://www.jiscmediahub.ac.uk
* "OpenDepot":http://opendepot.org/ (open access to journal papers)
* mapping data

Pre-press e-prints of articles can be published through "arXiv":http://arxiv.org/ and the related "Computing Research Repository":http://arxiv.org/corr/home .

[[And more repositories]]