Steve Welburn, 2012-07-10 09:41 AM


Integration of DSpace with other applications

Submitting files to repository

  • Apache FTP Server - Has hooks for processing after FTP commands. Could provide a basic interface to the repository.
  • SymlinkDSpace - Extension that adds symlinks to large files (e.g. videos) instead of uploading them.
  • DepositMO project - Has scripts to upload directly from a "watch directory", and also extensions to the SWORD2 protocol.
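
The "watch directory" idea can be illustrated with a minimal polling sketch. This is not DepositMO's code: the function names and the `deposit` callback are hypothetical, and the callback stands in for whatever actually sends a file to the repository (for example a SWORD2 client).

```python
import os
import time


def scan_once(watch_dir, seen, deposit):
    """Deposit each regular file in watch_dir that is not yet in `seen`.

    `deposit` is a hypothetical callable that takes a file path.
    Returns the names of the files handled in this pass.
    """
    new_files = []
    for name in sorted(os.listdir(watch_dir)):
        path = os.path.join(watch_dir, name)
        # Skip files already deposited and anything that is not a plain file.
        if name in seen or not os.path.isfile(path):
            continue
        deposit(path)
        seen.add(name)
        new_files.append(name)
    return new_files


def watch(watch_dir, deposit, interval=5.0):
    """Poll watch_dir forever, depositing new files as they appear."""
    seen = set()
    while True:
        scan_once(watch_dir, seen, deposit)
        time.sleep(interval)
```

A real watcher would also need to avoid picking up files that are still being written, for instance by waiting until a file's size stops changing.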

DataStage

To make DataStage (version 0.3rc2) work with C4DM's DSpace in the same way that SWORDUPLOADER does, the following file needs to be changed: /usr/lib/python2.6/dist-packages/datastage/dataset/sword2depositor.py

At line 66, it should read:

receipt = conn.create(col_iri=col.href, metadata_entry=e, suggested_identifier=dataset.identifier, in_progress=True)

Around line 133, it should read:

new_receipt = conn.update(dr = receipt,
                         payload=data,
                         mimetype="application/zip",
                         filename=dataset.identifier + ".zip",
                         in_progress=True,
                         packaging='http://dataflow.ox.ac.uk/package/DataBankBagIt')

With these changes, it should be possible to upload files to DSpace as it is configured at C4DM; other DSpace configurations may behave differently. The modified file can be downloaded from here.
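
The two calls fit together as a two-step deposit: the item is first created from metadata alone and left open, then the zip package is sent against the receipt that the first call returned. The sketch below wraps that sequence in one function. It assumes `conn` is a SWORD2 client connection exposing `create` and `update` with the keyword arguments shown in the snippets above; the function name and its parameters are hypothetical and are not part of DataStage.

```python
def deposit_dataset(conn, col_iri, entry, identifier, data, filename):
    """Create an item from metadata, then attach a zip package to it.

    `conn` is assumed to behave like the connection object used in
    sword2depositor.py; nothing here is DataStage's own code.
    """
    # Step 1: create the item from metadata only and keep the deposit open.
    receipt = conn.create(col_iri=col_iri,
                          metadata_entry=entry,
                          suggested_identifier=identifier,
                          in_progress=True)
    # Step 2: send the packaged content against the receipt from step 1.
    return conn.update(dr=receipt,
                       payload=data,
                       mimetype="application/zip",
                       filename=filename,
                       in_progress=True,
                       packaging="http://dataflow.ox.ac.uk/package/DataBankBagIt")
```

Passing the connection in as an argument keeps the sequence easy to exercise with a stand-in object before pointing it at a real repository.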

Additionally, the DataStage server doesn't start properly in VirtualBox. Before files can be submitted, it is necessary to restart it:

sudo datastage-server stop
sudo datastage-server start

Also see this blog post.

File Conversion

  • Xena converts files to open formats.

Metadata Sources

  • JHOVE2 - Uses DROID 4 internally for file identification.

The SCAlable Preservation Environments (SCAPE) project compared DROID, Fido, the Unix file utility, FITS and JHOVE2 for identifying file types.

The report was downloaded from the Open Planets Foundation (attached: SCAPE_PC_WP1_identification21092011.pdf). On JHOVE2, it notes:

"The main difference is that identification is only one part of JHOVE2’s functionality: it also includes feature extraction, validation and policy‐based assessment. These are all outside of the scope of this evaluation. It also means that any computational performance results cannot be directly compared with dedicated identification tools (although JHOVE2’s performance issues appear to be caused mainly by DROID 4, with JHOVE2’s native modules adding very little overhead)."
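
Identification tools of this kind rely largely on matching known byte signatures at the start of a file. The sketch below shows only that general idea with a handful of well-known signatures. It is not how DROID or any of the other tools is implemented, and the signature table and function names are illustrative assumptions.

```python
# A few well-known leading byte signatures ("magic numbers").
# Illustrative only; real tools use far larger signature registries.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"PK\x03\x04", "application/zip"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def identify(data):
    """Return a MIME type for the leading bytes in `data`, or None."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


def identify_file(path, probe_size=16):
    """Read the first few bytes of a file and identify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(probe_size))
```

A format that is built on another one (for example any zip-based document format) would be reported here simply as a zip file, which is one reason dedicated tools go beyond a leading-bytes check.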

Extending DSpace Metadata Support