Marco Fabiani, 2012-05-29 12:34 PM


Wiki

Usage

SWORD2 DSpace bulk uploader
--------------------

A Python script for submitting large numbers of files to a SWORD2-compatible repository, specifically DSpace 1.8.x.
Built on the SWORD2 python client library: https://github.com/swordapp/python-client-sword2

-----------------------------------------------
Installation:

- No installation is required: simply copy the script sworduploader.py to a suitable location. The first time you run the script, it creates the sword2_logging.conf file.
- A server.cfg file is also provided. If the --servicedoc option is not used, sworduploader reads the first line of server.cfg and uses it as the server's URL. If server.cfg is missing, it defaults to C4DM's server.
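
The precedence described above (command-line option, then server.cfg, then the built-in default) could be sketched roughly as follows. This is not the script's actual code: the function name `resolve_servicedoc` and the constant `DEFAULT_SERVICEDOC` are hypothetical, and treating an empty first line as "missing" is an assumption.

```python
from pathlib import Path

# Default URL as given in the script's help text (C4DM's server).
DEFAULT_SERVICEDOC = "http://c4dm.eecs.qmul.ac.uk/rdr/swordv2/servicedocument"


def resolve_servicedoc(cli_value=None, cfg_path="server.cfg"):
    """Hypothetical helper: choose the service document URL by precedence."""
    # 1. An explicit --servicedoc value always wins.
    if cli_value:
        return cli_value
    # 2. Otherwise use the first line of server.cfg, if the file exists.
    cfg = Path(cfg_path)
    if cfg.is_file():
        lines = cfg.read_text().splitlines()
        # Assumption: an empty first line is treated like a missing file.
        if lines and lines[0].strip():
            return lines[0].strip()
    # 3. Fall back to the default server.
    return DEFAULT_SERVICEDOC
```

Calling `resolve_servicedoc()` with no arguments in a directory without server.cfg would return the default URL.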

-----------------------------------------------
Usage:

sworduploader [-h] [--username USER_NAME] [--title TITLE]
              [--author AUTHOR [AUTHOR ...]] [--date DATE] [--zip]
              [--servicedoc DSPACEURL]
              data

Bulk upload to DSpace using SWORDv2.

positional arguments:
  data                  Accepts: METSDSpaceSIP and BagIt packages, simple zip
                        files, directories, single files. NOTE: METSDSpaceSIP
                        packages are only accepted by Collections with a
                        workflow!

optional arguments:
  -h, --help            show this help message and exit
  --username USER_NAME  DSpace username.
  --title TITLE         Title (ignored for METS packages).
  --author AUTHOR [AUTHOR ...]
                        Author(s) (ignored for METS packages). Accepts
                        multiple entries in the format "Surname, Name".
  --date DATE           Date of creation (string) (ignored for METS packages).
  --zip                 If "data" is a directory, compress it and post it as a
                        single file. The zip file will be saved along with the
                        individual files.
  --servicedoc SD       URL of the SWORDv2 service document (default: use
                        server.cfg if available, otherwise
                        http://c4dm.eecs.qmul.ac.uk/rdr/swordv2/servicedocument).
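
For readers who want to see how options like these fit together, here is a minimal argparse sketch that accepts the same arguments. It is an illustration reconstructed from the help text, not the script's real parser; `build_parser` is a hypothetical name and the example values are made up.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options listed in the help text."""
    parser = argparse.ArgumentParser(
        prog="sworduploader",
        description="Bulk upload to DSpace using SWORDv2.",
    )
    parser.add_argument("data", help="Package, zip file, directory or single file.")
    parser.add_argument("--username", metavar="USER_NAME", help="DSpace username.")
    parser.add_argument("--title", help="Title (ignored for METS packages).")
    # nargs="+" lets --author take one or more "Surname, Name" strings.
    parser.add_argument("--author", nargs="+", help='Author(s), "Surname, Name".')
    parser.add_argument("--date", help="Date of creation (string).")
    parser.add_argument("--zip", action="store_true",
                        help="Compress a directory and post it as a single file.")
    parser.add_argument("--servicedoc", metavar="SD",
                        help="URL of the SWORDv2 service document.")
    return parser


# Example invocation with made-up values; each author is one quoted argument.
args = build_parser().parse_args(
    ["--username", "user@example.org",
     "--author", "Fabiani, Marco", "Doe, Jane",
     "--zip", "my_dataset"]
)
```

Because --author accepts several values, an option such as --zip placed between the authors and the positional `data` argument keeps the path from being read as another author.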

If the submission is created successfully, it remains open so that the necessary metadata and licenses can be added through the DSpace web interface. The submission can be found in the "My Account -> Submissions" section of the user's area.

Updates

Version 0.6:
- Uploading a directory will maintain the path structure (i.e. subdirectories).
- A different server can be specified in the server.cfg file. This is overridden if the --servicedoc option is used. If the file is missing, sworduploader will default to C4DM's repository.
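
One way the path structure can be kept when a directory is packaged as a single zip file is to store each file under its path relative to the top-level directory. The sketch below uses only the standard library. `zip_directory` is a hypothetical name, and writing the archive next to the directory (rather than inside it) is an assumption, not necessarily what sworduploader does.

```python
import os
import zipfile


def zip_directory(directory, zip_path=None):
    """Hypothetical helper: zip `directory`, keeping subdirectory paths."""
    directory = os.path.abspath(directory)
    if zip_path is None:
        # Assumption: place the archive beside the directory, not inside it.
        zip_path = directory + ".zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(directory):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Store the path relative to the top-level directory so that
                # subdirectories are reproduced inside the archive.
                arcname = os.path.relpath(full_path, directory)
                archive.write(full_path, arcname)
    return zip_path
```

A file at `my_dataset/audio/take1.wav` would then be stored in the archive as `audio/take1.wav`.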

Possible problems

Version 0.4 is based on a modified python-sword2 library that can be found on Bitbucket (marcofabiani).

The script has also been tested with the version now on GitHub (richardjones). It works only if DSpace is patched with the latest version of sword-server-2.0, which corrects a mistake in the service document.

Issue 1: Service document verification fails (DSpace server)
https://bitbucket.org/richardjones/python-sword2/issue/1/service-document-verification-fails-dspace

richardjones on Fri, 20 Apr 2012 12:45:26 +0200:

The bug was in the Java common library used by default

See http://tools.ietf.org/html/draft-gregorio-atompub-multipart-04 for details of multipart support requirements.

This has now been fixed in the java server library, which is hosted at http://sword-app.svn.sourceforge.net/viewvc/sword-app/JavaServer2.0/trunk/

Changes:
status: new -> resolved

Executables

The executables for Mac and Windows have been produced using PyInstaller (http://www.pyinstaller.org/). Note that currently (v0.6) building the executables fails with the latest version of the swordv2 Python libraries from GitHub, although the script itself works when run with Python. The executables are therefore based on a different version of the libraries.