Version 1 (Steve Welburn, 2012-05-29 12:20 PM) → Version 2/5 (Marco Fabiani, 2012-05-29 02:33 PM)

h1. Integration of DSpace with other applications

h2. Submitting files to repository

* "Apache FTP Server":http://mina.apache.org/ftpserver/ - has hooks for processing after FTP commands, so it could provide a basic interface to the repository.

* "Storage Resource Broker (SRB) integration with DSpace":https://wiki.duraspace.org/display/DSPACE/DspaceSrbIntegration#DspaceSrbIntegration-MoreInformationabouttheProject (including Registration to ingest data)

* "SymlinkDSpace":https://wiki.duraspace.org/display/DSPACE/SymlinkDSpace - extension that adds symlinks to large files (e.g. videos) instead of uploading them.

* "DepositMO project":http://www.eprints.org/depositmo/ - has scripts to upload directly from a "watch directory", and also extensions to the SWORD2 protocol.

* "Some suggestions about creating Sword packages":http://dspace.2283337.n4.nabble.com/Producing-mets-xml-for-SWORD-td3285563.html

* "SwordUploader":https://code.soundsoftware.ac.uk/projects/sworduploader

* "DataStage":http://rdm.c4dm.eecs.qmul.ac.uk/datastage-and-dspace - to work with C4DM's DSpace the way SwordUploader does, changes are required. The file to change is (DataStage version 0.3rc2): /usr/lib/python2.6/dist-packages/datastage/dataset/sword2depositor.py

At line 66, it should read:
<pre>
receipt = conn.create(col_iri=col.href, metadata_entry=e, suggested_identifier=dataset.identifier,in_progress=True)
</pre>

Around line 133, it should read:
<pre>
new_receipt = conn.update(dr=receipt,
                          payload=data,
                          mimetype="application/zip",
                          filename=dataset.identifier + "zip",
                          in_progress=True,
                          packaging='http://dataflow.ox.ac.uk/package/DataBankBagIt')
</pre>

With these changes, it should be possible to upload files to DSpace *as configured at C4DM*. The modified file can be downloaded from "here":https://code.soundsoftware.ac.uk/attachments/446/sword2depositor.py
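
The two modified calls amount to a two-step deposit: create the item with its metadata, then attach the zip payload, keeping the deposit marked as in progress both times. The following is a minimal sketch of that flow, not the DataStage code itself. @RecordingConnection@, @deposit@, the example collection URL and the example file name are all hypothetical stand-ins introduced for illustration; only the keyword arguments and the packaging URL come from the snippets above.

```python
# Sketch only: RecordingConnection is a hypothetical stand-in for the
# connection object used in sword2depositor.py. It records the calls it
# receives instead of talking to a repository.

PACKAGING = "http://dataflow.ox.ac.uk/package/DataBankBagIt"


class RecordingConnection:
    """Hypothetical stub offering the two methods used in the snippets."""

    def __init__(self):
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(("create", kwargs))
        return {"step": "created"}  # placeholder for a deposit receipt

    def update(self, **kwargs):
        self.calls.append(("update", kwargs))
        return {"step": "updated"}  # placeholder for the new receipt


def deposit(conn, col_iri, entry, identifier, data, filename):
    """Create the item, then attach the zip, leaving the deposit in progress."""
    receipt = conn.create(col_iri=col_iri,
                          metadata_entry=entry,
                          suggested_identifier=identifier,
                          in_progress=True)
    return conn.update(dr=receipt,
                       payload=data,
                       mimetype="application/zip",
                       filename=filename,
                       in_progress=True,
                       packaging=PACKAGING)


conn = RecordingConnection()
deposit(conn, "http://example.org/collection", "entry", "dataset-1",
        b"zip bytes", "dataset-1.zip")
print([name for name, _ in conn.calls])
```

Passing the connection in as a parameter keeps the flow testable without a running repository; a real connection object with the same two methods could be substituted for the stub.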

h2. File Conversion

* "Xena":http://xena.sourceforge.net converts files to open formats.

h2. Metadata Sources

* "Library of New Zealand Metadata Extractor":http://meta-extractor.sourceforge.net/ - extracts metadata from binary files and outputs as XML.

* "Digital Record Object IDentification - DROID":http://sourceforge.net/projects/droid/files/droid/ - identifies file types and generates summary statistics. Now on version 6.

* "JSTOR/Harvard Object Validation Environment - JHove":http://hul.harvard.edu/jhove/

* "JHove2":http://www.bitbucket.org/jhove2/main - uses DROID 4 internally for file identification.

* "Apache Tika":http://tika.apache.org/

The SCAlable Preservation Environments (SCAPE) project compared DROID, Fido, Unix File Utility, FITS and JHove2 for identifying types of files.

"Downloaded from Open Planets Foundation":http://www.openplanetsfoundation.org/system/files/SCAPE_PC_WP1_identification21092011_0.pdf (Attached: attachment:SCAPE_PC_WP1_identification21092011.pdf)

bq. The main difference is that identification is only one part of JHOVE2’s functionality: it also includes feature extraction, validation and policy‐based assessment. These are all outside of the scope of this evaluation. It also means that any computational performance results cannot be directly compared with dedicated identification tools (although JHOVE2’s performance issues appear to be caused mainly by DROID 4, with JHOVE2’s native modules adding very little overhead).
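
As a rough illustration of what file identification with summary statistics involves, the sketch below matches the leading bytes of each input against a handful of well-known signatures and counts the results. It is not how DROID, Tika or JHOVE2 are implemented, and the signature table is a small hand-picked assumption; the real tools rely on far larger signature registries and more elaborate matching.

```python
from collections import Counter

# Illustrative signatures only (leading "magic" bytes of common formats).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP"),
    (b"fLaC", "FLAC"),
]


def identify(data):
    """Return a format label for the given bytes, or "unknown"."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def summarise(blobs):
    """Count how many inputs were identified as each format."""
    return Counter(identify(blob) for blob in blobs)


samples = [b"%PDF-1.4 ...", b"GIF89a....", b"PK\x03\x04....", b"plain text"]
summary = summarise(samples)
print(dict(summary))
```

In practice the bytes would come from reading the first few bytes of each file; in-memory samples are used here so the example runs without any files on disk.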

h2. Extending DSpace Metadata Support

* "Semantic web extensions for DSpace":http://simile.mit.edu/ adds _support for arbitrary schemata and metadata_ using semantic web technologies.