Steve Welburn, 2012-09-26 02:40 PM


SoDaMaT Wiki

This contains the full content of the SoDaMaT Wiki.

Sound Data Management Training (SoDaMaT)

(for general information re. Research Data Management please see the parent project Wiki)

Overview

Sound Data Management Training (SoDaMaT) is an eight-month project to create and evaluate discipline-specific data management training material for digital music and audio research. The materials will be targeted at: postgraduate research students (MSc and PhD); research staff (postdoctoral researchers, CIs, PIs); and academic staff. The project is to run at the Centre for Digital Music (C4DM) at Queen Mary University of London (QMUL) from June 2012 to January 2013, in collaboration with the QMUL Learning Institute.

The immediate objectives of the SoDaMaT project are:
  1. to develop specific training material on data management planning for research projects, targeting research and academic staff in digital music and audio research;
  2. to develop training material covering the different aspects of research data management, including subject-specific topics such as music copyrights, for postgraduate students, research and academic staff in the area of digital music and audio;
  3. to collaborate with institutional partners at QMUL (The Learning Institute), other projects (SoundSoftware.ac.uk), and discipline-specific societies (Digital Music Research Network, International Society for Music Information Retrieval) to test the training material in postgraduate courses, workshops, and tutorials, and to collect feedback on their quality and impact;
  4. to collaborate with institutional partners at QMUL (Learning Institute; School of Electronic Engineering and Computer Science; IT Services) to embed the training material into postgraduate curricula and Continuous Professional Development courses to assure the long-term sustainability and generalisation of the project's results to other similar disciplines.

The requirements will be scoped, and the training materials will be trialled, within the Centre for Digital Music (C4DM), part of the School of Electronic Engineering and Computer Science at QMUL.

In addition to designing, producing, and evaluating discipline-specific training material, a wider objective is to promote good practice in research data management through education and awareness both within QMUL, and across UK and overseas research institutions in the digital music and audio area.

Background

A survey on data management practices among researchers and students at C4DM, conducted during the JISC-funded Sustainable Management of Digital Music Research Data (SMDMRD) project, showed very low awareness of the importance of research data management as part of the research workflow. Although many researchers organise their data in folders and perform semi-regular backups, the majority answered the specific question "Do you have a particular strategy for data management" negatively. Through our links with other groups via the EPSRC-funded Sound Software project (see Collaborations section), we have good reason to believe the situation is similar in many other music and audio research groups. The results of the survey point to the need for raising awareness of the benefits of research data management, such as a potential increase in citations, understanding and meeting data management requirements set by funding bodies (e.g. EPSRC), and producing sustainable and reproducible research.

The SMDMRD project defined a set of data management policies and created a pilot data management system for C4DM. This was a pioneering effort within QMUL, and a collaboration with the QMUL IT Services has been recently established to adapt the results of the SMDMRD project to define institutional policies, and build an institutional research data repository. Policies can be used to raise awareness among research staff and students by imposing rules of conduct, and adherence to such policies is supported by tools like a data repository. Nevertheless, policies only give a general idea of why research data management is important, and the enthusiasm for using a data management system can easily fade if a culture for data management is not established. These facts point to the need for continuous, embedded, sustainable data management training, with strong focus on promoting the benefits of research data management, ideally from the early stages of a researcher's career.

The Digital Music and Audio Researcher Profile

A wealth of material for training researchers in data management has been produced by previous JISC-funded projects such as Incremental and those in the RDMTrain programme. The Research Data Management Skills Support Initiative (DaMSSI), which collected and compared the results from discipline-specific data management training projects in the RDMTrain programme, concluded in its final report that "participants respond well to discipline-specific examples and the opportunity to discuss issues with tutors and others in similar disciplines" and that "a discipline-specific approach is more likely to engage students - in many cases principles are the same across disciplines but are more interesting to students if these principles can be seen in the students' own context". DaMSSI also produced three discipline-specific researcher profiles - in the social sciences, in clinical psychology, and in archaeology - and two generic data profiles - the conservator and the data manager. We believe that researchers at the Centre for Digital Music, and researchers in similar laboratories or institutions, do not fit the above-mentioned profiles.

The Centre for Digital Music (C4DM) at QMUL is one of the leading research centres in the field of audio and music technology and signal processing. C4DM makes use of a variety of data as research inputs - most obviously audio datasets - and produces a variety of types of data as research outputs. These outputs include:
  1. manually annotated feature data ("reference annotations") such as expert chord and key transcriptions of existing music recordings which are used as comparative data for evaluating research work;
  2. automatically produced annotations such as those accompanying the publication of methods for audio feature analysis.

The primary targets for the training material to be produced by the proposed project are postgraduate research students, and research and academic staff in C4DM, who perform research over a range of areas including music informatics, machine listening, audio engineering and interaction. A common use-case in C4DM research is to run a newly-developed analysis algorithm on a set of audio examples and evaluate the algorithm by comparing its output with that of a human annotator. Results are then compared with published results using the same input data to determine whether the newly proposed approach makes any improvement on the state of the art.
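
The evaluation use-case described above can be sketched in a few lines of Python. This is a minimal illustration only, not C4DM's actual tooling: the function name `frame_accuracy`, the one-label-per-frame format, and the example chord sequences are all assumptions made for this example. It compares the labels produced by an algorithm with a human reference annotation and reports the fraction of frames on which the two agree.

```python
from typing import Sequence


def frame_accuracy(estimated: Sequence[str], reference: Sequence[str]) -> float:
    """Return the fraction of frames where the estimate matches the reference.

    Both sequences are assumed to hold one label per analysis frame and to
    cover the same audio excerpt, so they must have the same length.
    """
    if len(estimated) != len(reference):
        raise ValueError("estimated and reference must have the same length")
    if not reference:
        raise ValueError("cannot evaluate an empty annotation")
    matches = sum(1 for est, ref in zip(estimated, reference) if est == ref)
    return matches / len(reference)


# Hypothetical data: chord labels for eight frames of one recording.
reference_annotation = ["C", "C", "F", "F", "G", "G", "C", "C"]
algorithm_output = ["C", "C", "F", "G", "G", "G", "C", "Am"]

score = frame_accuracy(algorithm_output, reference_annotation)
print(f"Frame-level agreement: {score:.2f}")
```

A score of this kind is only comparable with published results when the same input audio and the same reference annotations are used, which is one reason why managing and sharing both matters.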

The type of data used in digital music and audio research poses some challenges that need to be addressed in discipline-specific training material. These challenges include:
  1. Copyright: the copyright status of digital music data is often difficult to establish. For example, the owner of internally generated data might be unclear, or data purchased or downloaded from outside might have special license requirements that must be adhered to. As a result, researchers often refrain from publishing data in order to avoid unnecessary risk. Addressing this aspect in detail and emphasising the use of less restrictive licenses (e.g. Creative Commons, Open Data Commons) could lead to a larger amount of data being published in public repositories.
  2. Metadata: the line between data and metadata is often unclear. For example, descriptive metadata (e.g. a song's title, author, year of publication, or key) is used as data in another context. The training material will focus on defining what data and metadata are, on the importance of metadata standards, and on their use, together with standard protocols such as OAI-PMH and SWORD, to exchange data among repositories.
  3. Ethical approval and participant agreement: experimental work based on human responses (e.g. perceptual listening tests) requires ethical approval. The lack of information and experience on this topic leads people to write ethics forms that prohibit the release of data, preventing other researchers from reproducing or extending their results, when data could be safely released with the participants' consent if anonymised. Data is often not published because, for lack of information, the creators tend to be exceedingly "safe" in this respect. The material will include information on how ethical approval works, how to obtain it, and information about publication of sensitive data.
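
As an illustration of the metadata exchange mentioned under item 2, the sketch below builds an OAI-PMH `ListRecords` request for Dublin Core (`oai_dc`) metadata. The repository base URL and the set name are hypothetical; the `verb`, `metadataPrefix`, `set`, and `from` parameter names are those defined by the OAI-PMH protocol. The code only constructs the request URL and does not contact any server.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None,
                     from_date: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec      # restrict harvesting to one set
    if from_date is not None:
        params["from"] = from_date    # only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint and set name, for illustration only.
url = list_records_url(
    "https://repository.example.org/oai",
    set_spec="audio_annotations",
    from_date="2012-06-01",
)
print(url)
```

A harvester would send this request over HTTP and receive an XML document containing the descriptive metadata of each matching record, which is how two repositories can exchange metadata without sharing a common storage system.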

In addition to the recommendations from DaMSSI, the need for specific training material for digital music and audio researchers is justified by at least two additional factors. First, most of the researchers are either computer scientists or electrical engineers and have advanced IT skills. Second, the data is very heterogeneous, rapidly changing, and relatively small in size. As a result, it is usually managed by the creator of the data itself. Thus, the clear separation pointed out by the profiles produced by DaMSSI, as well as in Pryor and Donnelly (2009, p. 165), between the data creator and the data manager/librarian/scientist becomes blurred: all the different aspects can be, and often are, taken care of by the same person.

Evaluation

Close attention will be paid to evaluating the quality of the training material and its impact on research practice. By taking advantage of the established collaborations, the material will be tested in different situations, including postgraduate courses, internal and external seminars and workshops, and tutorials at international conferences. The International Society for Music Information Retrieval (ISMIR) serves the purpose of fostering the exchange of ideas between and among members whose activities, though diverse, stem from a common interest in music information retrieval. A tutorial proposal has been submitted in collaboration with the Sound Software project to the 2012 ISMIR conference (8-12 October in Porto, Portugal). A tutorial proposal will also be submitted to DAFx-12 (Digital Audio Effects conference, 17-21 September in York).

The QMUL Learning Institute will provide support and know-how in evaluation methodologies and analysis.

Feedback will be collected using:
  1. anonymous questionnaires after the tutorials/workshops, tailored to the specific audience;
  2. online questionnaires;
  3. standard course evaluation for postgraduate modules;
  4. focus groups interviewed a few months after the training to establish the longer-term impact of the training.

The feedback will be used to iteratively improve the material. Revised versions of all training materials will be available by the end of the project.

Sustainability

We aim to achieve sustainability in the longer term both in the digital music and audio research community, and within QMUL. Our goals are:
  1. to make discipline-specific training sustainable in the digital music and audio research community. Awareness will be raised by presenting the material in collaboration with the Sound Software project at similar UK research institutions, and at discipline-specific conferences (ISMIR and DAFx). Training material will be made available for reuse through the Jorum repository.
  2. to set an example within QMUL. The project will be used as an example by the QMUL Learning Institute, the School of Electronic Engineering and Computer Science, and the IT Services to expand the data management training to other disciplines by adapting the material and methodologies, starting from related research areas such as Signal Processing, and more generally Electronic Engineering and Computer Science. Data management training will be integrated in postgraduate curricula: every PhD student is expected to take part in approximately 210 hours of development activities (including research methods courses) over the course of their studies and the points gained are mapped against the four domains of the Vitae/RCUK Researcher Development Framework. Material for Continuous Professional Development courses for research and academic staff will also be adapted to other disciplines, and all face-to-face training will be complemented by online training material.

Workplan

The work of the project is divided into four work packages (WP):

An overview of the intended content of the work packages is here

Training the Trainers

Additional Notes

References

Pryor, G. and Donnelly, M. (2009). Skilling up to do data: whose role, whose responsibility, whose career? The International Journal of Digital Curation. Vol. 4(2), pp. 158-170.

Research Data Management Skills Support Initiative (DaMSSI) final report

SoDaMaT Printable Version

Workplan

The work of the project is divided into four work packages (WP).

WP1 Training Material Design

Although the basic principles of data management are valid both for postgraduate students and for research and academic staff, we decided to make a distinction between the two groups (WP1.3 and WP1.4): a PhD student starting on their project and a PI writing a grant proposal might want to focus on different aspects of data management. The online material (WP1.2) will cover all aspects and be relevant to both groups.

WP1.1 Research Of Available Resources

Results from previous projects (e.g. JISC RDMTrain programme, Research Data Management Skills Support Initiative (DaMSSI), Incremental), as well as available material from the DCC and other institutions, will be studied and evaluated. Disciplines will be compared and parts of the available material identified that need to be adapted to appeal to researchers in the area of digital music and audio research. In order to integrate the material into the Vitae/RCUK Researcher Development Framework, used to assign credits by the QMUL Learning Institute, the recently released "Information-handling Lens" will also be analysed.

WP1.2 Online Training Material

The Incremental project's final report (page 21) includes the recommendation to "create a collection of webpages to help researchers find tools and assistance". Examples of the online material will include FAQs, fact-sheets, online step-by-step guides (e.g. on creating a data management plan for PIs writing a project proposal), and short instructional videos (e.g. on how to deposit a data set into a repository, from metadata collection to choosing a license). The online material will target both new members of staff who could not participate in face-to-face training, and those who need quick reference material or want to learn in greater depth after a seminar. It will also contain information on where to get help for different problems (e.g. copyrights, technical) inside the institution. The online material will be prepared first because it should already be in place when face-to-face training is given.

The online materials have been prepared in the form of a wiki, and are part of this site.

WP1.3 Research Staff Material

Material will be designed that targets research and academic staff involved in funded research projects, although the basic principles will be relevant to students as well. Experience from the Sound Software project showed that different material is useful at different stages of a project. We will thus create a range of training materials to cover some of these stages, to be presented in different formats (e.g. short seminars, tutorials, workshops), and to be complemented by online material. Examples include, but are not limited to:

  1. a five-minute "executive" pitch on the benefits of data management;
  2. hands-on workshops for CIs and PIs on data management planning for research projects;
  3. conference tutorials giving an overview of research data management;
  4. material for short seminars with in-depth analysis of single aspects of data management such as available tools, policies, and discipline-specific challenges.

This material will be presented at internal seminars, discipline-specific conferences and, in collaboration with the Sound Software project, at other institutions in the UK working in the area of digital music and audio research.

WP1.4 Post-Graduate Course Material

Discipline-focused material for face-to-face training sessions will be designed. The material will cover the basics of good data management practice, point out its benefits, and touch on discipline-specific challenges such as copyrights and licenses, with discipline-specific examples, based on the recently developed C4DM Data Management System. Also, as suggested by the DaMSSI project final report (Conclusions, page 15, paragraph 5), the students will be instructed to create a Data Management Plan for their PhD projects, to be included in their Research Proposal. The material should be sufficient to cover at most one or two sessions in a module. For more in-depth study, the students will then be referred to the online material. The material will be tested first with postgraduate students at C4DM, and then at other research groups in the Digital Music Research Network.

WP1 Deliverables

  • D1.1 Summary and analysis of material already available.
  • D1.2 First draft of the online material.
  • D1.3 First draft of the research staff material.
  • D1.4 First draft of the postgraduate course material.
  • D1.5 Updated version of the research staff material.
  • D1.6 Updated version of the online reference material.
  • D1.7 Updated version of the postgraduate course material.

WP2 Test and evaluation

WP2.1 Evaluation strategies design

With the support of the QMUL Learning Institute, workshop questionnaires will be developed, based on the Institute's prior experience from other projects, in order to evaluate the effectiveness of the training material and its delivery. The feedback obtained will be used to inform future activities.

WP2.2 Feedback collection and analysis - online material

The online material will be released as early as possible during the project. Continuous online evaluation will be used to collect feedback and make the appropriate changes.

WP2.3 Feedback collection and analysis - research staff material

The workshop material will be tested at various institutions across the UK, and at the ISMIR 2012 conference, where questionnaires will be handed out at the end of each session.

WP2.4 Feedback collection and analysis - postgraduate course material

Feedback for the postgraduate course material will be collected through the standard course evaluation procedures in place at QMUL.

WP2 Deliverables

  • D2.1 Questionnaires for evaluating training material (all types).
  • D2.2 Summary of the collected feedback for the online material and recommendations for improvement.
  • D2.3 Summary of the collected feedback for the research staff material and recommendations for improvement.
  • D2.4 Summary of the collected feedback for the postgraduate course material and recommendations for improvement.

WP3 Embedding

This work package covers the organisation of the various workshops, courses and seminars in collaboration with the partners.

WP3 Deliverables

  • D3.1 Final report on embedding.

WP4 Communication and Management

WP4.1 Project management

The project will be managed on a day-to-day basis by the PI, with project meetings held weekly to assess progress and problems. This has been our practice throughout the Sound Software project and previous JISC-funded projects. The CIs will participate in the management process to ensure compatibility and continuity with the requirements of the Sound Software project from a management and technical perspective respectively.

WP4.2 Dissemination

The project results will be disseminated through blog posts, Twitter, and official reports on the project's website. Results will also be presented at discipline-specific conferences (ISMIR, DAFx), and to other similar UK-based research institutions via the partnership with the Sound Software project.

WP4 Deliverables

  • D4.1 Project site and feed.
  • D4.2 Final report and publication of the material in the Jorum repository.

References

DaMSSI final report

WP1.1 Research Of Available Resources

Results from previous projects (e.g. JISC RDMTrain programme, Research Data Management Skills Support Initiative (DaMSSI), Incremental), as well as available material from the DCC and other institutions, will be studied and evaluated. Disciplines will be compared and parts of the available material identified that need to be adapted to appeal to researchers in the area of digital music and audio research. In order to integrate the material into the Vitae/RCUK Researcher Development Framework, used to assign credits by the QMUL Learning Institute, the recently released "Information-handling Lens" will also be analysed.

Previous JISC Projects with Data Management Training Outputs

Many materials relating to data management are available through Jorum; these include audio interviews, PowerPoint presentations, factsheets, videos and more. Many of these are outputs of previous JISC-funded projects, and we consider some of those here.

The JISC RDMTrain programme funded five discipline-specific research data management training projects in 2010-2011.

Two projects produced online courses:
  • Project CAIRO - Managing Creative Arts Research Data (4 short units)
  • MANTRA - for geosciences, social and political sciences and clinical psychology (a very detailed self-guided course)

The remaining three courses are published as downloadable materials for training sessions:
  • DATUM for Health - 3 sessions
  • DMTpsych - 6 sessions
  • DataTrain - different versions for archaeology (4 sessions) and social anthropology (3 modules targeted at different audiences)

The Supporting Data Management Infrastructure for the Humanities (Sudamih) project at Oxford was funded by JISC under the Research Data Management Infrastructure Programme. Sudamih produced training materials specifically to fit in with the practice of humanities research at Oxford and also released de-localised materials on Jorum:
  • Three slideshows at varying levels of detail including materials targeted at post-doc researchers (Jorum)
  • Research Data Management Factsheet (Jorum)
  • Research Information Management Guides (Jorum)
  • Research Information Management: Organising Humanities Material (Jorum)
  • Research Information Management: Tools for the Humanities (Jorum)

The Incremental project at Glasgow and Cambridge was also part of the JISC Research Data Management Infrastructure programme. Incremental aimed to develop a data management infrastructure by examining existing practices and requirements at the institutions, piloting tools and services to enable data management (examples of proposed outputs included "templates, training, best practice guidelines, and policy") and embedding those outputs within the institutions. In addition, they aimed to disseminate the results to the wider research community. During the course of the project many training resources were produced and these have been published on Jorum. We provide a summary of some resources on our page on Incremental.

Vitae Researcher Development Framework

The Vitae Researcher Development Framework (RDF) categorises the knowledge, behaviours and attributes of researchers and uses this as a foundation to guide the development of researcher skills.

In April 2012, Vitae published an information literacy component for the RDF.

Information literacy is an umbrella term which encompasses concepts such as digital, visual and media literacies, academic literacy, information handling, information skills, data curation and data management. Interacting with information is at the very heart of research and informed researchers are both consumers and producers of information.

The RDF component included an information literacy lens - mapping information literacy skills onto the RDF researcher model - and an Informed Researcher Booklet giving guidelines to researchers on evaluating and improving their information literacy.

The RDF Information Literacy lens is largely based on the Society of College, National and University Libraries (SCONUL) 7 Pillars Of Information Literacy.

Other Vitae RDF Lenses and the more general SCONUL 7 Pillars Of Wisdom may also be of interest.

JISC and the Research Information Network (RIN) co-funded the Research Data Management Skills Support Initiative (DaMSSI) at the DCC. This aimed to examine how the Vitae RDF and the SCONUL 7 Pillars Of Information Literacy could be used to improve the planning of data management training, contributing to the development of the Vitae Information Literacy Lens and Informed Researcher Booklet described above.

The DaMSSI outputs were (from here):

Of particular interest to the current project are the mappings of previous RDM training projects onto the RDF and the Digital Curation Lifecycle Model.

DCC and Other Institutions

Several UK universities have published materials relating to data management and, in particular, data management training:

In addition, universities in other countries have published materials:

Although the legal and funder requirements for these organisations will differ from the UK situation, the underlying principles for data management are still the same.

Some UK research councils have published policies regarding data management, data sharing and data curation:

We are in the process of summarising the main Research Council Requirements.

Other organisations have also produced materials related to data management training:

Legislation

Resources For Learning Materials

QMUL resources for e-Learning

Resources available at QMUL for eLearning include:
  • Moodle - a Course Management System (CMS), also known as a Learning Management System (LMS) or a Virtual Learning Environment (VLE).
  • Mahara - an open-source e-portfolio system ("open source eportfolios")
  • Articulate for developing online/e-Learning materials
  • qReview lecture capture system
  • Adobe Connect web-conferencing
  • Bristol Online Surveys (QMUL) for developing... surveys

Links

Doctoral Training Centres as catalysts for research data management
RDM training for Postgraduates and Doctoral Training Centres
Open Exeter PGR Workshop on Data Management

DataTrain

DataTrain for archaeology and for social anthropology

Modules are available on Jorum:

Archaeology

Licensed CC-BY-NC-SA

Structure of course:

Modules:

  1. Creating and managing research data in archaeology: an overview
  2. Data lifecycles and management plans
  3. Working with digital data
  4. Rights and digital data
  5. E-Theses and supplementary digital data
  6. Archiving digital data
  7. Post-Graduate data management plans
  8. Project and professional data: data management on post-doctoral research projects and beyond

The teaching modules were run as a trial course in March 2011, as part of a post-graduate course in Digital Skills for Dissertation and Publications, Department of Archaeology, University of Cambridge. The data management course comprised four two-hour sessions:

  1. Creating and Managing Data - Defining post-graduate research data
  2. Working with Digital Data
    File structure, naming, and formats
    E-theses and supplementary digital data
    Post-Graduate Data Management Plans
  3. Project and Professional Data
    Data management for larger research projects
  4. Archiving and Re-using Data
    Depositing digital data
    Intellectual Property Rights and research data

The slides and notes have been kept as simple and as straightforward as possible. They are not meant to be exhaustive in the information they contain. Rather, they provide an overview of the general issues regarding data management.

Each module has been designed to take approximately 30 minutes to complete. Six of the eight presentations have between 10 and 16 slides (including front title and end acknowledgement slides). The two longer modules are Module 3: Working with Digital Data; and Module 8: Project and Professional Data.

Module 3 (Working with Digital Data) has 38 slides, many of which contain a lot of information on different file types and formats. This information has been summarised from the Archaeology Data Service’s Guides to Good Practice, and content most relevant to post-graduate students is presented in a straightforward way. Rather than spending an hour presenting Module 3 in detail (and boring the students to death), it is suggested that the slides be presented as a ‘lightning tour’ of the practical issues of working with digital data. The slides can then be made available for future reference by the students as a handout.

Module 8 (Project and Professional Data) provides an introduction to data management at a higher level of research, including writing AHRC Technical Appendices. While this can be run as a stand-alone session, given that this is the desired career path of many doctoral students, and the fact that many doctoral students carry out their research as part of larger projects, the aim of the module is to round off the post-graduate course by looking forward beyond the submission of a PhD Thesis.

Comments regarding discipline-specific nature (from notes for part 1 of course):

Can archaeology be considered in any way a special case in terms of how we create, manage, and archive digital data?
The simple answer is no. The issues of how best to manage digital data and safeguard its preservation in the long term are broadly the same across all disciplines.
The same goes for individual archaeological projects. Even though some might think that their own project is a special case in terms of complicated digital data, or for the fact that they will produce very little in the way of digital data, at the heart of it, the same issues apply, just on a larger or smaller scale.
A key issue which does vary from discipline to discipline is that of what are private data and what are public data. This does arise in archaeology particularly in regard to sensitive data of site or artefact locations, or sensitive personal data collected during the course of a research project.
What perhaps sets archaeology apart from other disciplines is the appreciation of the historical significance of what we do. And the fact that very often, the practice of archaeology is a destructive process and the physical and digital data obtained represent a unique archive – an experiment that cannot be repeated.

However... primary data is often paper-based. Notes, sketches etc.

One discipline-specific area is the selection of bodies that provide definitions of good practice and/or archiving facilities (e.g. Archaeology Data Service). Who are these for digital audio research ? AES ? JASA ? ISMIR ? IEEE ? Others ?

Includes details of copyright terms for 8 types of creative works: Literary; Artistic; Sound; Typographic; Broadcasts; Dramatic; Film; and Musical.

For post-grad students, e-Theses are covered. Publishing a digital copy of a thesis makes it "published" and means that all copyright details need to be ironed out.

Part 8 is largely related to resources (arch. specific).

Social Anthropology

A different approach...
  • Basic module - aimed at pre-fieldwork PhD students, fundamentals
  • Advanced module - metadata, ethics, IPR, FoI, data protection, tools
  • Writing-up module - for PhD students and early stage researchers, includes info on long-term archiving

Can be combined to produce a 1-day course.

Mentions reference management. Where is the line between Research Data Management and Data Management ?

Lots of info. on data capture - digitizing data.

Points to an interesting list of formats from the UK Data Archive (http://www.data-archive.ac.uk)

Posting things on CDs/DVDs might be a good idea for infrequent sharing of large amounts of data. Beware of security issues, which can be sidestepped by encryption (more later); and of decay/damage.
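One way to notice decay or damage on posted or archived media is to record a checksum for every file before the disc is written, and to re-check the files against that record later. The following is a minimal sketch of that idea, not something taken from the course material: the use of SHA-256 and the function names `sha256_of`, `build_manifest` and `changed_files` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or no longer match."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

The manifest would be built before burning the disc and kept separately from it; running `changed_files` on the recipient's copy then reports any file that is missing or has been altered. A checksum only detects damage, it does not repair it or protect confidentiality.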

In the Advanced module, examples are drawn from the discipline.

DATUM for Health

Comprises 3 sessions: Plus additional notes for:

Downloadable from Jorum (CC-BY-NC-SA)

Session 1: Introduction To Data Management (Northumbria)

  • What is research data ?
  • Where is your research data ?
  • Why manage research data
    • a requirement
    • to work effectively & efficiently
    • to protect it
    • for use and/or re-use
    • to share it
    • for preservation
    • because it is good research practice
  • How to manage research data
  • The research data lifecycle
    • Plan / Create / Analyse / Preserve / Share / Use (and repeat...)
  • Creating a DMP

Session 2: Data Curation Lifecycle (Northumbria)

  • What is data curation ?
  • Why curate ?
    • Requirements
    • Rewards
  • DCC Data Curation Lifecycle Model
    • Conceptualise - planning
    • Create - collection & analysis
    • Appraise - selection
    • Ingest - transferring to a custodian
    • Preserve - keeping data over time
    • Store - keeping data safe
    • Access - finding data
    • Transform - generating new data

Session 3: Problems and practical strategies and solutions (Northumbria)

  • What problems are there ?
    • Conflicting considerations
    • Resource issues
    • anything else ?
  • Conflicts
    • Confidentiality and sharing
      • personal and sensitive data - anonymisation, consent
  • Data security and storage
    • File and folder names
    • Locations
    • Email is not secure
    • Physical security - destroy USB sticks, shred documents
  • Metadata

DMTpsych

Postgraduate training for research data management in the psychological sciences

DMTpsych built upon existing research data management materials developed by the Digital Curation Centre (DCC) to create discipline-focused postgraduate training materials that can be embedded into postgraduate research training in the psychological sciences. The materials produced consist of:

  • PowerPoint slides to be used in taught research methods courses
  • Workbook containing psychology specific guidance on completing the DCC’s Online Data Management Planning Tool (including worked examples)
  • A paper copy of the DMPT to be completed by students (actually at DCC)

The lectures are structured thematically to match the existing DCC DMPT with the eight key sections forming the centrepiece of six psychology specific lectures and round table discussions.

Deliverables online

Material available for:
  • Overview
  • 1. Historical and Conceptual Issues and Best Practice
  • 2. Introduction and context to psychology-specific DMPT
  • 3a. Access, data sharing and re-use; Legal and ethical issues
    Good detail on Data Protection and FoI. Less good on IPR.
  • 3b. Data standards and capture methods
  • 4. Short-term storage and data management; Deposit and long-term preservation
  • 5. Resourcing; Adherence, review and long-term management
  • 6. Completion of your own Data Management Plan
  • Informed Consent Form

Licensed CC-BY-NC:

This license lets others remix, tweak, and build upon your work non-commercially, and although their new works must also acknowledge you and be non-commercial, they don’t have to license their derivative works on the same terms.

Written from a psychology perspective... but the content isn't particularly psych.

Incremental

This project will build on earlier work by HATII and the DCC to support research data management. It will analyse needs at Glasgow and Cambridge across a number of different disciplines; propose a range of tools or services to address those needs; and develop, adapt and pilot these within each institution. Outputs will then be further adapted and prepared for embedding in local infrastructures and wider dissemination via the Digital Curation Centre, Digital Preservation Coalition, and JISC. The project intends to focus on the provision of softer infrastructure (e.g. templates, training, best practice guidelines, and policy).

Includes multimedia files (audio, video)

Funded by JISC 2010-2011

From Jorum (largely CC-BY-NC-SA)

  • Re-use, sharing, and archiving sensitive research data: a practical overview - slideshow (Jorum)
  • How data centres and repositories can help with research data management (Jorum)
  • University of Glasgow: bidding for grant funding workflow (Jorum)
  • University of Cambridge: bidding for funding workflow (Jorum)
  • The university ethics process and how it impacts on making creative work (Jorum)
  • The benefits of sharing research data (Jorum)
Digital media:
  • Managing music data (Jorum)
  • Managing multimedia research data (Jorum)
  • Working with digital media files (Jorum)
Sensitive data:
  • Archiving sensitive research data (Jorum)
  • Managing sensitive data in performing arts - narrated slideshow (Jorum)
  • Re-use, sharing, and archiving sensitive research data: a practical overview - slideshow (Jorum)
IPR:
  • Intellectual Property Rights and Research Data: Focus on copyright - narrated slideshow (Jorum)
  • Who owns IPR? - flowchart (Jorum)
  • Intellectual property rights (IPR) and the creation and use of research materials (Jorum)
  • Intellectual property rights and University of Cambridge: Focus on patents and commercialisation - narrated slideshow (Jorum)
FoI:
  • How the Freedom of Information Act (FOI) applies to research data (Jorum)
  • FAQ for Freedom of Information and Environmental Information Requests for Research Data - narrated slideshow (Jorum)
  • Using the UK Freedom of Information Act: A practical guide for researchers - narrated slideshow (Jorum)
  • What options do researchers have when asked to release their data by an FOI request? (Jorum)
Factsheets:
  • UK research funders' data policies (Jorum)
  • Organising files and folders (Jorum)
  • Adding metadata to Microsoft Office documents (Jorum)
  • Choosing the right digital storage media for you (Jorum)
  • Selecting which data to keep (Jorum)
  • Common Image Formats (Jorum)
  • Selecting which data to keep at University of Glasgow (Jorum)
  • Version control across devices (Jorum)

Incremental Project

Content produced by the Incremental project is released under Creative Commons licence BY-NC-SA

Project site at Cambridge

  • Create
  • Organise
  • Access
  • Look After

MANTRA

geosciences, social and political sciences and clinical psychology

This course is an Open Educational Resource that may be freely used by anyone.
It is available through an open license for re-using, rebranding, repurposing.

License

You are free to re-use part or all of this work elsewhere, with or without modification. In order to comply with the attribution requirements of the Creative Commons license (CC-BY), we request that you cite:

  • the author/creator: EDINA and Data Library, University of Edinburgh
  • the title of the work: Research Data MANTRA [online course]
  • the URL where the original work can be found: http://datalib.edina.ac.uk/mantra

Downloadable from Jorum (DSpace!)

Structure:
  • Introduction
  • Research Data Explained: in depth discussion of what data is and types of data
  • Data Management Plans
  • Organising Data: File naming and storing
  • File Formats
  • Documentation and Metadata
  • Storage and Security
  • Data Protection, Rights and Access (in development / not available)
  • Preservation, Sharing and Licensing (in development / not available)
  • Recommended Resources

Includes videos. Approx. 20 slides per topic (13-30).

Appears to have been built using Xerte

Project CAIRO - Managing Creative Arts Research Data

Online course materials consisting of four units: Downloadable from Jorum:

Introduction to Research Data Management

Examines:

Presents a workflow for arts data management:
Planning -> Creating -> Shaping -> Long-term management

Planning should be good practice - science is largely about data collection and evaluation. Planning this process is part of experimental design. Can you easily meet more than just immediate data needs and contribute to the community at large ?
  • Who's it for ?
  • What documentation is required ?
  • Are there stipulations on data management (timescales, repositories, publish, sensitivity) ?
  • Is assessment required ? How do we enable it ?
  • Are there guidelines we should follow (e.g. institutional) ?
  • If we will publish data, do repositories have requirements for formats ?
Creating is day-to-day working data management:
  • collecting permissions as required
  • documenting data
  • considering file formats
  • backups
Shaping is curation:
  • selection of data
  • extending metadata
  • use of sustainable file formats

Long-term management is after-the-research management of data - NB: the nature of this means that it will involve handing data over to a long-term archive. Occasional activities required (e.g. changing file formats)

AHRC rules are that data needs to be kept for 3 years after a project concludes. (see Research Council Requirements)

Creating Research Data

Focuses on actions before data is created:

HE and FE institutions should ensure that [...] employees and students are aware that, while some exemptions are granted for the use of personal data for research purposes, the majority of the Data Protection Principles must still be conformed to — there is no blanket exemption.
(JISC Data Protection Code of Practice for the HE and FE Sectors (2001))

The simplest way to deal with the DPA is to remove personal information. Anonymised data doesn't come under the DPA. So consider carefully whether any personal details in data add to its usefulness or could be removed.
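As a rough illustration of removing personal details, the sketch below drops named columns from tabular data. It is an assumption-laden example rather than a recommended procedure: the column names in `PERSONAL_FIELDS` and the functions `strip_personal_fields` and `anonymise_csv` are hypothetical, and dropping direct identifiers may not be enough on its own if the remaining fields could still identify someone in combination.

```python
import csv
import io

# Hypothetical column names; adjust these to the dataset in question.
PERSONAL_FIELDS = {"name", "email", "date_of_birth"}


def strip_personal_fields(rows, personal_fields=PERSONAL_FIELDS):
    """Return copies of the rows without the listed personal fields."""
    return [
        {key: value for key, value in row.items() if key not in personal_fields}
        for row in rows
    ]


def anonymise_csv(text, personal_fields=PERSONAL_FIELDS):
    """Read CSV text and return CSV text with the personal columns removed."""
    reader = csv.DictReader(io.StringIO(text))
    rows = strip_personal_fields(reader, personal_fields)
    kept = [f for f in (reader.fieldnames or []) if f not in personal_fields]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=kept, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

For example, a listening-test results file with columns `participant,name,email,rating` would come out with only `participant` and `rating`, provided `participant` is a code that cannot itself be traced back to a person.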

Managing Research Data

Delivering Research Data

Identifying issues that come to light after the creation of research data, and overcoming those issues.

WP2.1 Evaluation Strategy Design

In order to evaluate the (training) materials, it is necessary to:
  • identify specific (learning) objectives which they aim to meet
  • evaluate whether the materials meet those objectives
Additionally, we need to:
  • identify the overall purposes of the materials
  • evaluate whether the cumulative objectives satisfy these overall purposes

In order to produce the best possible materials, it is necessary to evaluate and then revise the materials. Initial evaluation of the materials will take place once a first draft has been created, but before they are used in training. This will concentrate on the suitability and level of the content. After the initial evaluation and update, the materials will be used in training courses and begin an ongoing series of formative and summative evaluations (i.e. evaluations during and after the training). These evaluations will apply Kirkpatrick's four-level evaluation model [1].

Methods of Evaluation

Design review

Pre-course (evaluation of materials)

On-going evaluation of course

  • Informal / Formal Review e.g.:
    • Questionnaire to see how easy it is to find relevant material / test users' knowledge
    • Focus Groups (I-Tech guide)
  • In-course / formative evaluation (see I-Tech)
    • Assessment of level of knowledge within training group
    • Checking progress with participants
    • Trainer assessment - self assessment and from other trainers if possible
    • Pre- and Post-course questionnaires - assess change in answers (true/false + multiple-choice)
  • Post-course / summative evaluation
    • Debriefing of trainer (did it work ? to time ? did it engage people ?)
    • Questionnaire for participants (see Sample training evaluation forms)
    • Medium-term review of usefulness of course content / adoption of techniques (e.g. 2-3 months after course)
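
The pre- and post-course questionnaire comparison listed above could be scored along the following lines. This is only a sketch under assumed conventions: each participant's answers are held as a dictionary keyed by question, an answer key gives the correct true/false or multiple-choice response, and the function names `score`, `change_in_score` and `mean_change` are illustrative.

```python
def score(answers, answer_key):
    """Count the questions in the answer key that were answered correctly."""
    return sum(1 for q, correct in answer_key.items() if answers.get(q) == correct)


def change_in_score(pre, post, answer_key):
    """Return (pre score, post score, difference) for one participant."""
    before = score(pre, answer_key)
    after = score(post, answer_key)
    return before, after, after - before


def mean_change(responses, answer_key):
    """Average score change over (pre, post) pairs; 0.0 if there are none."""
    changes = [change_in_score(pre, post, answer_key)[2] for pre, post in responses]
    return sum(changes) / len(changes) if changes else 0.0
```

A positive mean change suggests that facts got across during the session; it says nothing about whether participants later change how they manage their data, which is what the medium-term review is for.
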
Kirkpatrick's Four Level Evaluation Model
  • Reaction - to the course ("motivation" may be more appropriate)
    • Pacing, was it enjoyable
  • Learning - from the course
    • Did the facts get across ?
  • Behavior - changes after the course
    • Did participants actually manage their data better
      • during research ?
      • at the end of research ?
    • Have data management plans been produced for grant proposals ?
  • Results or Impact - to a wider community
    • Did they publish data ?
    • Was any data loss avoided ?
Review content for:
  • reading level
  • correctness
  • organization
  • ease of use

based on the target audience

Tools and links

Bristol Online Surveys

I-Tech Training Toolkit

Instructional System Design approach to training

Free Management Library - Evaluating Training and Results

Lingualinks Implement A Literacy Program - Evaluating Training

Training Works!... ...what you need to know about managing, designing, delivering, and evaluating group-based training

References

[1] Kirkpatrick, D. L. (1959). Techniques for evaluating training programs. Journal of the American Society of Training Directors, 13, 3–9.

[2] Kirkpatrick, D. L. (1976). Evaluation of training. In R. L. Craig (Ed.), Training and development handbook: A guide to human resource development (2nd ed., pp. 301–319). New York: McGraw-Hill.

Additional Notes
Training the Trainers
Legislation
Copyright
Data Protection
Freedom Of information
Research Council Requirements
Resources For Learning Materials
WP1 2 Online Training Material
Data Management By Researcher Need
At The End Of The Research
Before The Research
During The Research
Data Management Skills
Archiving research data
Backing up
Documenting data
Managing Software As Data
Publishing research data
And more repositories
MUSHRA
Research Management