Publishing research data » History » Version 69
Steve Welburn, 2013-02-26 10:59 AM
1 | 1 | Steve Welburn | h1. Publishing research data |
---|---|---|---|
2 | 1 | Steve Welburn | |
3 | 45 | Steve Welburn | Research data publication allows your data to be reused by other researchers, e.g. to validate your research or to carry out follow-on research. To that end, a suitable data publication host will make your data discoverable (e.g. by publishing metadata) and publicly accessible (i.e. on the internet). |
4 | 45 | Steve Welburn | |
5 | 1 | Steve Welburn | Research data can be published on the internet through: |
6 | 45 | Steve Welburn | * project web-sites |
7 | 45 | Steve Welburn | * research group web-sites |
8 | 2 | Steve Welburn | * generic web archives (e.g. "archive.org":http://archive.org) |
9 | 2 | Steve Welburn | * research data sites (e.g. "figshare":http://figshare.com/) |
10 | 8 | Steve Welburn | * more general open access research hosts (e.g. "f1000 Research":http://f1000research.com/about/) |
11 | 4 | Steve Welburn | * thematic repositories dedicated to a specific discipline / subject area - sadly there is no sign of an appropriate repository for digital music and audio research |
12 | 4 | Steve Welburn | * institutional repositories dedicated to research from a specific organisation (e.g. QMUL have "a repository":https://qmro.qmul.ac.uk/jspui/ through which "Green open access":http://en.wikipedia.org/wiki/Open_access copies of papers by QM research staff can be published). |
13 | 56 | Steve Welburn | * supplementary materials [[Publishing Data Through Journals|attached to journal articles]] |
14 | 1 | Steve Welburn | |
15 | 67 | Steve Welburn | An appropriate [[license]] should be granted to allow other researchers to use your research data. |
16 | 67 | Steve Welburn | |
17 | 69 | Steve Welburn | Within the Centre for Digital Music, we now have a "research data repository":http://c4dm.eecs.qmul.ac.uk/rdr/ for publishing research data outputs from the group. Publishing data through the C4DM repository gives a single point for publishing C4DM data on the internet without relying on (possibly ephemeral) project-specific web-sites. Other repositories that may be of interest to researchers are listed [[And_more_repositories|here]]. |
18 | 1 | Steve Welburn | |
19 | 68 | Simon Dixon | If the web-site through which the data is published is also to be the long-term archive for your data, then you should check that it meets the criteria for an archival storage system. Note that although data will be written to the host irregularly, it is expected that published data will be accessed more frequently than archived data, making offline storage unsuitable. |
20 | 1 | Steve Welburn | |
21 | 68 | Simon Dixon | If an external publisher is used for your research data, you should check the Terms and Conditions e.g. to see whether copyright on the data is transferred to the publisher and to check for how long they will publish your data. |
22 | 58 | Steve Welburn | |
23 | 68 | Simon Dixon | If data is published through a publisher or repository, then it may also be held on institutional storage as long as the publisher's license is followed, which might e.g. require that there is a link back to the publisher from the institutional repository. Publishing under a Creative Commons license makes this easy. |
24 | 66 | Steve Welburn | |
25 | 68 | Simon Dixon | If data is available in multiple places, different versions of the data might arise (e.g. changes between dates uploaded, data corruption). You should therefore make it easy to identify which specific version of the data is correct by publishing a "digital fingerprint":http://en.wikipedia.org/wiki/Cryptographic_hash_function (e.g. an "MD5 hash":http://en.wikipedia.org/wiki/MD5). MD5 fingerprints can be generated in Windows using "MD5summer":http://www.md5summer.org/, in Linux with the GNU "md5sum":http://www.gnu.org/software/coreutils/manual/html_node/md5sum-invocation.html utility and on Mac OS X using "md5 or openssl":http://osxdaily.com/2009/10/13/check-md5-hash-on-your-mac/. |
26 | 58 | Steve Welburn | |
27 | 47 | Steve Welburn | h2. Persistent IDs for data |
28 | 48 | Steve Welburn | |
29 | 49 | Steve Welburn | In order to ensure ongoing access to your data, you should look to acquire a persistent ID for your dataset. However, persistence is a continuum, with some IDs more persistent than others. DOIs and handles are designed to be persistent in the long term, allowing a unique identifier to be redirected to the current location of your dataset - if the dataset moves, the DOI/handle can be pointed at the new location. Repositories and research data sites may provide DOIs for data submitted to them. Institutional URLs may be persistent if the institution makes a policy decision to make them so. Other URLs may change when web-sites are revamped, making the published URL for your data return a "404 Not Found" message. |
30 | 46 | Steve Welburn | |
31 | 68 | Simon Dixon | Persistent IDs are useful for referencing datasets, and are particularly handy if they are short. Long or ugly DOIs can be shortened using the "ShortDOI":http://shortdoi.org service. |
32 | 1 | Steve Welburn | |
33 | 61 | Steve Welburn | See also [[And more repositories|more repositories]].
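
The digital fingerprint check described above (publishing an MD5 hash so that copies of a dataset held in different places can be verified) can also be scripted. The following is a minimal sketch in Python using only the standard @hashlib@ module; the function names @file_fingerprint@ and @matches_published@ and the file name used afterwards are illustrative assumptions, not part of any tool mentioned on this page.

```python
import hashlib


def file_fingerprint(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so that large datasets do not have to
    fit in memory. `algorithm` is any name accepted by hashlib.new(),
    e.g. "md5" or "sha256".
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex, algorithm="md5"):
    """Compare a local copy against a published fingerprint.

    Whitespace and letter case in the published value are ignored,
    since hex digests are often copied from web pages.
    """
    return file_fingerprint(path, algorithm) == published_hex.strip().lower()
```

For example, @file_fingerprint("dataset.zip")@ (a hypothetical file name) would give the value to publish alongside the data, and anyone holding a copy could call @matches_published@ with that value to confirm their copy is identical. MD5 is adequate for spotting accidental changes or corruption, but it is not collision-resistant; if deliberate tampering is a concern, passing @algorithm="sha256"@ is a reasonable alternative.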