Integration of DSpace with other applications » History » Version 2

Marco Fabiani, 2012-05-29 02:33 PM

h1. Integration of DSpace with other applications

h2. Submitting files to repository

* "Apache FTP Server":http://mina.apache.org/ftpserver/ - provides hooks for processing after FTP commands, so it could offer a basic interface to the repository.
* "Storage Resource Broker (SRB) integration with DSpace":https://wiki.duraspace.org/display/DSPACE/DspaceSrbIntegration#DspaceSrbIntegration-MoreInformationabouttheProject  (including Registration to ingest data)
* "SymlinkDSpace":https://wiki.duraspace.org/display/DSPACE/SymlinkDSpace - extension that adds symlinks to large files (e.g. videos) instead of uploading them.
* "DepositMO project":http://www.eprints.org/depositmo/ - has scripts to upload directly from a "watch directory", as well as extensions to the SWORD2 protocol.
* "Some suggestions about creating SWORD packages":http://dspace.2283337.n4.nabble.com/Producing-mets-xml-for-SWORD-td3285563.html
* "SwordUploader":https://code.soundsoftware.ac.uk/projects/sworduploader
* "DataStage":http://rdm.c4dm.eecs.qmul.ac.uk/datastage-and-dspace - to work with C4DM's DSpace in the same way as SwordUploader, changes are required to the file /usr/lib/python2.6/dist-packages/datastage/dataset/sword2depositor.py (DataStage version 0.3rc2).

At line 66, it should read:
<pre>
receipt = conn.create(col_iri=col.href, metadata_entry=e, suggested_identifier=dataset.identifier, in_progress=True)
</pre>

Around line 133, it should read:
<pre>
new_receipt = conn.update(dr=receipt,
                          payload=data,
                          mimetype="application/zip",
                          filename=dataset.identifier + "zip",
                          in_progress=True,
                          packaging='http://dataflow.ox.ac.uk/package/DataBankBagIt')
</pre>

With these changes, it should be possible to upload files to DSpace _as configured at C4DM_. The modified file can be downloaded "here":https://code.soundsoftware.ac.uk/attachments/446/sword2depositor.py
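
Taken together, the two patched calls amount to a two-step deposit in which both steps leave the item open. The sequence can be sketched as below, assuming a connection object that exposes create and update with the keyword arguments shown in the patches above. The helper name deposit_in_progress and its parameters are hypothetical, and the connection is passed in rather than constructed, so this is an illustration and not DataStage's actual code.

```python
# Packaging identifier copied from the patch above.
PACKAGING = "http://dataflow.ox.ac.uk/package/DataBankBagIt"


def deposit_in_progress(conn, col_iri, entry, identifier, data):
    """Hypothetical helper mirroring the two patched calls.

    conn is assumed to expose create() and update() accepting the
    keyword arguments used in the patched sword2depositor.py.
    """
    # Step 1: create the item from its metadata entry and leave it open.
    receipt = conn.create(col_iri=col_iri,
                          metadata_entry=entry,
                          suggested_identifier=identifier,
                          in_progress=True)
    # Step 2: attach the zipped payload, still flagged as in progress.
    # The filename expression is kept exactly as written in the patch above.
    return conn.update(dr=receipt,
                       payload=data,
                       mimetype="application/zip",
                       filename=identifier + "zip",
                       in_progress=True,
                       packaging=PACKAGING)
```

Any object with matching create and update methods can be passed as conn, so the sequence can be exercised against a stub before it is pointed at a real repository.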

h2. File Conversion

* "Xena":http://xena.sourceforge.net - converts files to open formats.

h2. Metadata Sources

* "National Library of New Zealand Metadata Extractor":http://meta-extractor.sourceforge.net/ - extracts metadata from binary files and outputs it as XML.
* "Digital Record Object IDentification - DROID":http://sourceforge.net/projects/droid/files/droid/ - identifies file types and generates summary statistics. Now on version 6.
* "JSTOR/Harvard Object Validation Environment - JHove":http://hul.harvard.edu/jhove/
* "JHove2":http://www.bitbucket.org/jhove2/main - uses DROID 4 for file identification.
* "Apache Tika":http://tika.apache.org/

The SCAlable Preservation Environments (SCAPE) project compared DROID, Fido, the Unix file utility, FITS and JHOVE2 for identifying file types.

The report was "downloaded from the Open Planets Foundation":http://www.openplanetsfoundation.org/system/files/SCAPE_PC_WP1_identification21092011_0.pdf (attached: attachment:SCAPE_PC_WP1_identification21092011.pdf). On JHOVE2, it notes:

bq. The main difference is that identification is only one part of JHOVE2’s functionality: it also includes feature extraction, validation and policy‐based assessment. These are all outside of the scope of this evaluation. It also means that any computational performance results cannot be directly compared with dedicated identification tools (although JHOVE2’s performance issues appear to be caused mainly by DROID 4, with JHOVE2’s native modules adding very little overhead).

h2. Extending DSpace Metadata Support

* "Semantic web extensions for DSpace":http://simile.mit.edu/ - adds _support for arbitrary schemata and metadata_ using semantic web technologies.