CodeHosting » History » Version 2
Chris Cannam, 2010-09-20 12:20 PM
1 | 1 | Chris Cannam | h1. The code hosting problem |
---|---|---|---|
2 | 1 | Chris Cannam | |
3 | 1 | Chris Cannam | h2. Assumptions found in my head |
4 | 1 | Chris Cannam | |
5 | 1 | Chris Cannam | * Audio and music research groups in institutions *lack effective access to version control systems* |
6 | 1 | Chris Cannam | |
7 | 1 | Chris Cannam | * This is certainly historically true of C4DM; what about other groups? |
8 | 1 | Chris Cannam | |
9 | 1 | Chris Cannam | * Researchers often want to *share their code selectively* with other researchers in the same field but in other institutions |
10 | 1 | Chris Cannam | |
11 | 1 | Chris Cannam | * Internal code hosting doesn't usually facilitate this |
12 | 1 | Chris Cannam | |
13 | 1 | Chris Cannam | * Individual researchers may be happy to host their code in *existing public hosting services* (e.g. SourceForge, Google Code), but their supervisors are likely to be less keen |
14 | 1 | Chris Cannam | |
15 | 2 | Chris Cannam | * Supervisors don't necessarily appreciate these services' requirement that everything should be open source, and the services make it hard for supervisors to keep track of what work their students are producing |
16 | 1 | Chris Cannam | |
17 | 2 | Chris Cannam | * The opposite dynamic may occur in some places -- researchers may be self-conscious about publishing code even when their supervisors encourage them to |
18 | 2 | Chris Cannam | |
19 | 1 | Chris Cannam | _How can we test these assumptions?_ |
20 | 1 | Chris Cannam | |
21 | 1 | Chris Cannam | _If these assumptions are correct, how do we solve these problems?_ |
22 | 1 | Chris Cannam | |
23 | 1 | Chris Cannam | h3. We could encourage and train institutions to provide better internal code hosting facilities |
24 | 1 | Chris Cannam | |
25 | 2 | Chris Cannam | For example, by providing nice recipes, templates, support, etc. for setting up well-featured, friendly services. A good code management facility would bring together, among other things, a version control system with a nice web front-end, project data sharing facilities (such as a wiki), and a sensible authentication system that doesn't involve a whole new username/password database. |
26 | 1 | Chris Cannam | |
27 | 1 | Chris Cannam | This is certainly likely to improve code development practice in an institution that has no facility at present. But it doesn't really solve the "selective sharing" problem, or help very much with the desire to move toward publication of software and reproducible research -- unless we can also convince people to make their own internal hosting facility a public one. |
28 | 1 | Chris Cannam | |
29 | 2 | Chris Cannam | Audio and music research groups are typically too small to run their own facilities successfully. To do this well, they really need a horizontal approach -- facilities provided to all research areas by a common CS or IT service. This isn't necessarily the most effective approach if we want to improve search and access specifically for audio-related research code, but it may be the easiest approach to maintain. This is (presumably?) the sort of thing that the general Software Sustainability Institute (SSI) ought to be exercising itself with. |
30 | 1 | Chris Cannam | |
31 | 2 | Chris Cannam | Some institutions will have a central system already. How many? Which? Are they happy with it? Can the SSI guess at any of these figures for us? Would the existence of a working, if not ideal, internal facility make a group less likely to accept any other approach that we might propose? |
32 | 1 | Chris Cannam | |
33 | 2 | Chris Cannam | h3. We could encourage institutions to make use of existing external facilities |
34 | 1 | Chris Cannam | |
35 | 2 | Chris Cannam | Researchers are often familiar with services like Google Code, SourceForge, GitHub etc already, and in some cases may use them even for hosting code that is not really supposed to be published ("yet") if they have a need to share it with one or more individuals at other institutions. If they are comfortable with doing that, why not encourage it -- since it also promotes open publication and has little or no maintenance cost? |
36 | 1 | Chris Cannam | |
37 | 2 | Chris Cannam | The big problem is that this doesn't address private hosting for projects that are "not yet" ready for publication. It might be attractive to persuade groups that their code should all be public from the start, but that's not very realistic. In any case, it's probably unwise during advocacy to mix up a technical solution to a practical problem (use of version control during development) with promotion of a philosophical position (code should be published). |
38 | 1 | Chris Cannam | |
39 | 2 | Chris Cannam | Also, keeping track of projects in these external facilities is hard -- both for prospective users and reusers who want to find stuff, and for institutions that want to keep track of the work that their researchers are producing. |
40 | 1 | Chris Cannam | |
41 | 2 | Chris Cannam | All this said, these services work -- we don't want to find ourselves proposing methods that will be less attractive to motivated researchers. |
42 | 2 | Chris Cannam | |
43 | 2 | Chris Cannam | h3. We could provide a dedicated facility |
44 | 2 | Chris Cannam | |
45 | 2 | Chris Cannam | We could set up a new facility that offers private hosting and access control, so that in theory institutions can treat it as an internal facility, with the ability to "promote" their projects to public status when desired. |