SWORD2 DSpace bulk uploader
A Python script to submit large numbers of files to a SWORD2-compatible repository, specifically DSpace 1.8.x.
Built on the SWORD2 python client library: https://github.com/swordapp/python-client-sword2
- No installation is required: simply copy the script sworduploader.py to a suitable location. The first time you run the script, it will create the sword2_logging.conf file.
-- However, it does have two prerequisites: the SWORD2 python client library and, possibly, lxml.
-- The SWORD2 python client library is available in several versions:
---- The version available via pip is incompatible with this uploader.
---- The version at https://github.com/swordapp/python-client-sword2 is the official version, but works with a 30-second timeout.
---- The version at https://github.com/sjwqmul/python-client-sword2 is our fork, and allows the timeout value to be specified.
-- lxml may be required by the swordv2 client under certain (undefined) circumstances.
- A server.cfg file is also available. If the --servicedoc option is not used, sworduploader reads the first line of server.cfg and uses it as the server's URL. If server.cfg is missing, it defaults to C4DM's server.
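The lookup order described above (the --servicedoc option, then the first line of server.cfg, then C4DM's server) can be sketched as follows. This is only an illustration, not the script's actual code: the function name resolve_servicedoc and its parameters are hypothetical, and falling back to the default when the first line of server.cfg is empty is an assumption.

```python
from pathlib import Path

# Default service document URL, as given in the uploader's help text.
DEFAULT_SERVICEDOC = "http://c4dm.eecs.qmul.ac.uk/rdr/swordv2/servicedocument"


def resolve_servicedoc(cli_value=None, cfg_path="server.cfg"):
    """Pick the service document URL (hypothetical helper).

    Order: --servicedoc value, else first line of server.cfg, else default.
    """
    if cli_value:
        return cli_value
    cfg = Path(cfg_path)
    if cfg.is_file():
        with cfg.open() as fh:
            first_line = fh.readline().strip()
        # Assumption: an empty first line is treated like a missing file.
        if first_line:
            return first_line
    return DEFAULT_SERVICEDOC
```

With this sketch, resolve_servicedoc("http://example.org/sd") returns the given URL without touching server.cfg, while resolve_servicedoc() in a directory without server.cfg returns the C4DM default.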
    sworduploader [-h] [--username USER_NAME] [--title TITLE]
                  [--author AUTHOR [AUTHOR ...]] [--date DATE]
                  [--servicedoc DSPACEURL]
                  data

    Bulk upload to DSpace using SWORDv2.

    positional arguments:
      data                  Accepts: METSDSpaceSIP and BagIt packages, simple zip
                            files, directories, single files. NOTE: METSDSpaceSIP
                            packages are only accepted by Collections with a
                            workflow!

    optional arguments:
      -h, --help            show this help message and exit
      --username USER_NAME  DSpace username.
      --title TITLE         Title (ignored for METS packages).
      --author AUTHOR [AUTHOR ...]
                            Author(s) (ignored for METS packages). Accepts
                            multiple entries in the format "Surname, Name"
      --date DATE           Date of creation (string) (ignored for METS packages).
      --zip                 If "data" is a directory, compress it and post it as a
                            single file. The zip file will be saved along with the
                            individual files.
      --servicedoc SD       Url of the SWORDv2 service document (default: use
                            server.cfg if available, otherwise
                            http://c4dm.eecs.qmul.ac.uk/rdr/swordv2/servicedocument)
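For readers who want to see how these options fit together, here is a minimal argparse sketch that accepts the same arguments as the help text above. It is a reconstruction from that help text, not the parser used in sworduploader.py; the function name build_parser and the shortened help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="sworduploader",
        description="Bulk upload to DSpace using SWORDv2.")
    parser.add_argument(
        "data",
        help="METSDSpaceSIP or BagIt package, zip file, directory or single file.")
    parser.add_argument("--username", dest="user_name", metavar="USER_NAME",
                        help="DSpace username.")
    parser.add_argument("--title", help="Title (ignored for METS packages).")
    # nargs="+" lets --author take one or more "Surname, Name" values.
    parser.add_argument("--author", nargs="+",
                        help="Author(s) (ignored for METS packages).")
    parser.add_argument("--date",
                        help="Date of creation (ignored for METS packages).")
    parser.add_argument("--zip", action="store_true",
                        help="Compress a directory and post it as a single file.")
    parser.add_argument("--servicedoc", metavar="SD",
                        help="URL of the SWORDv2 service document.")
    return parser


# Example invocation with an explicit argument list instead of sys.argv.
args = build_parser().parse_args(
    ["mydata", "--username", "jane", "--author", "Doe, Jane", "Roe, Richard"])
print(args.data, args.user_name, args.author)
```

Because --author accepts several values, argparse would also swallow a data argument placed directly after it. Giving data first, or putting another option between the author list and data, avoids this.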
If the submission is created successfully, it will remain open to be completed with the necessary metadata and licenses, using the DSpace web interface. The submission can be found in the "My Account -> Submissions" section of the user's area.
- Uploading a directory will maintain the path structure (i.e. subdirectories).
- A different server can be specified in the server.cfg file. This is overridden if the --servicedoc option is used. If the file is missing, sworduploader will default to C4DM's repository.
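To illustrate what keeping the path structure of a directory means, the following sketch walks a directory and pairs each file with its path relative to the directory root, then optionally packs those files into one zip archive in the spirit of the --zip option. It is an assumption about one reasonable way to do this, not the uploader's implementation; list_files, zip_directory and the skip parameter are hypothetical names.

```python
import os
import zipfile


def list_files(root, skip=None):
    """Yield (full_path, relative_path) for every file under root.

    relative_path keeps the subdirectories and always uses "/" separators.
    """
    skip = os.path.abspath(skip) if skip else None
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            if skip and os.path.abspath(full) == skip:
                continue  # do not list the archive that is being written
            yield full, os.path.relpath(full, root).replace(os.sep, "/")


def zip_directory(root, zip_path):
    """Write all files under root into zip_path, preserving relative paths."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for full, rel in list_files(root, skip=zip_path):
            zf.write(full, arcname=rel)
    return zip_path
```

For a directory mydata containing a.txt and sub/b.txt, list_files("mydata") yields the relative paths "a.txt" and "sub/b.txt", and zip_directory("mydata", "mydata.zip") stores the files under those same names.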
Version 0.4 is based on the modified python-sword2 library that can be found on bitbucket/marcofabiani.
The script has been tested with the version now on github (richardjones), and it works IF DSpace is patched with the latest version of the sword-server-2.0 which corrects a mistake in the service document.
Issue 1: Service document verification fails (DSpace server)
richardjones on Fri, 20 Apr 2012 12:45:26 +0200:
The bug was in the Java common library used by default
See http://tools.ietf.org/html/draft-gregorio-atompub-multipart-04 for details of multipart support requirements.
This has now been fixed in the java server library, which is hosted at http://sword-app.svn.sourceforge.net/viewvc/sword-app/JavaServer2.0/trunk/
status: new -> resolved
The executables for Mac and Windows have been produced using PyInstaller. Note that currently (v0.6) building the executables fails when using the latest version of the swordv2 python libraries from GitHub, although the script works when run with Python. The executables are therefore based on a different version of the libraries, which is included here under sword2-libraries-pyinstaller-compatible.