Wiki » History » Version 3

Luis Figueira, 2011-09-20 04:08 PM
added a paragraph

1 1 Chris Cannam
h1. Outline notes for ICASSP 2012 paper submission

General form of a paper:

# What problem we tried to solve
# How we tried to solve it
# How well it worked

h2. What is the problem here?

(Can we add references here!?)

# Research in this field involves developing software
# That software typically is not published
# Consequently it's hard to get hold of reference implementations of significant algorithms or to reproduce results from papers

We did a survey -- but can we package its results in a way that provides any sense of scientific rigour?

It found:

# People use lots of different languages and environments
# Many people don't share their code
# A surprising number asserted that they did not intend to publish any code, and that their code never left their own computer

We also observed that our own facilities were not ideal and not being used to best advantage.

We also made some observations in the Autumn School (though this is getting a bit circular since the Autumn School was one of the things we've been trying to do as well):

# Many attendees had never used version control before
# When its benefits were shown to them, they were generally very receptive and positive (version control was identified as a good point in the programme)
# But they found it trickier than I had expected to get going with the Subversion client used in the workshop

h2. What have we been trying to do?

# Promote collaborative development from the outset. If we can encourage people to work together on code even just as much as they would normally work together on a paper, we will increase their comfort with disclosing code later. This faces some tricky cultural obstacles, though (e.g. the need to convince supervisors etc. that you did your own work on your own)
# Provide facilities and services that people can use and educate them to make best use of them (or of any facilities they already have)
# Get hands-on, taking care of code that people really want to use

h2. What have we done so far?

# Autumn School
# Code repository site
# EasyMercurial

h2. How well has it worked?

h2. What will we do next?

# More learning materials
# Follow-ups to Autumn School
# Visits to other UK research institutions

h2. Thoughts from Mark

Concrete examples are always nice.

We should aim to make a case: "I could use this facility", "I could copy this facility", or "I could apply this method".

General scientific paper principle: Background (what other people have tried, and why it is lacking); our solution; evaluation. For background we can discuss the general philosophy of software development and how software is actually developed in academia (with stats).

The Reproducible Research movement shows a similar intention but does not address quite the same problem -- it is somewhat orthogonal. If a researcher does not elect to make their results available in a full RR form, they nonetheless still have a problem with how their software is actually made -- we can make incremental improvements to software development practice (a bottom-up approach) and can help people get better results no matter how they publish their work. Sustainability and reusability are relevant even if you do not intend ever to publish your code openly.

[However, encouraging openness in software development should lead to greater comfort with the idea of reproducible research as well.]

In future work, we can consider how to handle really big data -- and how to handle copyright etc. -- in a way that is less ad hoc than is typically the case at present.

It is important to explain how our project is different -- Reproducible Research is a tool, not an end in itself. We aim for sustainability by tying together code versioning, documentation, code availability, etc. The idea of a (UK-wide) community is fundamental to achieving our goals.

h2. Reproducible Research References

EPFL Page on RR:
* http://lcav.epfl.ch/reproducible_research

EPFL's Repository:
* http://rr.epfl.ch/

Some papers worth reading (http://reproducibleresearch.net/index.php/RR_links):

* WaveLab and Reproducible Research (J. B. Buckheit and D. L. Donoho)
** Dept. of Statistics, Stanford University, Tech. Rep. 474, 1995.
** http://www-stat.stanford.edu/~donoho/Reports/1995/wavelab.pdf
* Reproducible Research: The bottom line (J. de Leeuw)
** Department of Statistics, UCLA, Department of Statistics Papers, March 2001.
** http://escholarship.org/uc/item/9050x4r4#page-1
* Pushing Science into Signal Processing (M. Barni and F. Perez-Gonzalez)
** IEEE Signal Processing Magazine, vol. 22, no. 4, pp. 119–120, July 2005.
** http://www.gts.tsc.uvigo.es/~fperez/docs/pushing_science.pdf
* Sharing Detailed Research Data Is Associated with Increased Citation Rate (H. A. Piwowar, R. S. Day, and D. B. Fridsma)
** PLoS ONE, March 2007.
** http://www.plosone.org/article/info:doi/10.1371/journal.pone.0000308
* How to encourage and publish reproducible research (J. Kovačević)
** ICASSP 2007.
** http://lcav.epfl.ch/files/content/sites/lcav/files/reproductible_research/ICASSP07/Kovacevic07_pres.pdf
* Reproducible Research in Signal Processing - What, why, and how (P. Vandewalle, J. Kovačević and M. Vetterli)
** IEEE Signal Processing Magazine, May 2009.
** http://rr.epfl.ch/17/

Maybe some similarities?
* SHARE: a web portal for creating and sharing executable research papers (P. Van Gorp and S. Mazanek)
** Proc. International Conference on Computational Science, 2011.