Wiki

NB: All this may be superseded by more recent versions of DataStage, DSpace and the SWORDv2 server!

According to Marco's blog post it looks as though a specific ingester isn't necessary, but could be used to process the manifest.rdf to add metadata.

DataStage under VirtualBox

The DataStage server doesn't start properly in VirtualBox. Before files can be submitted, it has to be restarted by hand:

sudo datastage-server stop
sudo datastage-server start

SWORDv2 Server

If using DSpace 1.8.2, the Java SWORDv2 server library (/system/webapps/swordv2/WEB-INF/server-2.0-classes.jar) MUST be replaced with the latest version from https://github.com/swordapp/JavaServer2.0 .

Tomcat configuration

Tomcat must allow access to the swordv2 DSpace webapp (probably configured in the Tomcat server.xml). If a DSpace instance has been restricted to core functionality, only the JSP or XML UI webapps may be available. Test by trying to access

swordv2/servicedocument

under the DSpace home URL. With password authentication, this should prompt for a username and password and then return the SWORD service document.

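e.g. on our test server it is at: http://c4dm.eecs.qmul.ac.uk/smdmrd-test/swordv2/servicedocument

A Context entry along these lines makes the webapp available (adjust path and docBase to the local installation):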

        <Context path="/dspace/swordv2" docBase="/PathToDspace/webapps/swordv2" debug="0" 
                reloadable="true" cachingAllowed="false" 
                allowLinking="true"/>
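
The service document check described above can also be scripted. The following is a minimal Python 3 sketch using only the standard library. The function names are ours, HTTP Basic authentication is assumed, and the URL in the commented usage line is the test server mentioned above. It can be run from any client machine, not necessarily the DataStage VM.

```python
import base64
import urllib.request


def service_document_url(dspace_home):
    """Return the SWORDv2 service document URL under a DSpace home URL."""
    return dspace_home.rstrip("/") + "/swordv2/servicedocument"


def build_request(dspace_home, username, password):
    """Build a GET request carrying HTTP Basic credentials (an assumption)."""
    credentials = "{0}:{1}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(service_document_url(dspace_home))
    request.add_header("Authorization", "Basic " + token)
    return request


def fetch_service_document(dspace_home, username, password, timeout=30):
    """Fetch the service document as text.

    urlopen raises urllib.error.HTTPError for responses such as 401 or 404,
    which usually means wrong credentials or that Tomcat is not exposing
    the swordv2 webapp.
    """
    request = build_request(dspace_home, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example (needs network access and real credentials):
# print(fetch_service_document("http://c4dm.eecs.qmul.ac.uk/smdmrd-test",
#                              "username", "password"))
```

If the call returns XML listing at least one collection, the swordv2 webapp is reachable and authentication works.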

Tomcat ran out of permanent generation (PermGen) memory when the SWORD interface was used. This was fixed by adding the following to tomcat6.conf:

JAVA_OPTS="-XX:MaxPermSize=256m"

DataStage

To make DataStage deposit to C4DM's DSpace the way SWORDUPLOADER does, changes are required to the following file (DataStage version 0.3rc2): /usr/lib/python2.6/dist-packages/datastage/dataset/sword2depositor.py .

At line 66, it should read:

receipt = conn.create(col_iri=col.href, metadata_entry=e, suggested_identifier=dataset.identifier, in_progress=True)

Around line 133, it should read:

new_receipt = conn.update(dr=receipt,
                          payload=data,
                          mimetype="application/zip",
                          filename=dataset.identifier + ".zip",
                          in_progress=True,
                          packaging='http://dataflow.ox.ac.uk/package/DataBankBagIt')

With these changes, it should be possible to upload files to DSpace AS CONFIGURED AT C4DM! The modified file can be downloaded from here
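
To show how the two modified calls fit together, here is a minimal sketch of the deposit sequence: first create an in-progress item from the metadata entry, then upload the zip against the returned receipt. The helper deposit_dataset and the RecordingConnection stand-in are hypothetical and exist only so the flow can be dry-run without a server. The keyword arguments are the ones quoted on this page, and the ".zip" suffix is our assumption about the intended file name.

```python
def deposit_dataset(conn, col_iri, entry, identifier, data):
    """Create an in-progress deposit, then attach the zipped dataset to it.

    `conn` is expected to behave like the connection object used in
    sword2depositor.py; only the keyword arguments quoted on this page
    are passed to it.
    """
    # Step 1: create the item from the metadata entry and keep it open.
    receipt = conn.create(col_iri=col_iri,
                          metadata_entry=entry,
                          suggested_identifier=identifier,
                          in_progress=True)
    # Step 2: upload the zip against that receipt, still in progress.
    # The ".zip" suffix is an assumption about the intended file name.
    return conn.update(dr=receipt,
                       payload=data,
                       mimetype="application/zip",
                       filename=identifier + ".zip",
                       in_progress=True,
                       packaging="http://dataflow.ox.ac.uk/package/DataBankBagIt")


class RecordingConnection(object):
    """Hypothetical stand-in that records calls instead of contacting DSpace."""

    def __init__(self):
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(("create", kwargs))
        return "receipt-1"

    def update(self, **kwargs):
        self.calls.append(("update", kwargs))
        return "receipt-2"
```

Calling deposit_dataset(RecordingConnection(), ...) and inspecting .calls shows the order and arguments of the two requests. Against a real server, the connection object built in sword2depositor.py would be passed instead.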