Plan for Software Carpentry 2012 Boot Camps on Version Control

About the session

The session is a two-hour interactive live workshop, using EasyMercurial and the Mercurial command-line tool.

Its goal is to explain version control to researchers who have never used it before, or who want to understand it better.

Outline

The basic plan is:

  1. Presentation: an introduction to version control in general
  2. Long worked example: basic version control topics are worked through using EasyMercurial, then some more advanced topics are revisited using the command-line tool
  3. Closing remarks about other tools, other topics of interest etc

Worked example

Preliminary: Check that participants have EasyMercurial installed and working. All of the following exercises use EasyMercurial unless stated otherwise.

Part 1: Working by yourself

Topics: Initialising a repository, committing files, reading history, looking at diffs, reverting unwanted changes, going back in time to look at old versions

We will be working on a recipe for fish stew for a future recipe book.

Adding your first file

  • Make a new directory, create a text file fishstew.txt in it, start adding an ingredients list, save
  • Start EasyMercurial; enter user name and email address
  • "Open" that directory, see fishstew.txt in untracked file list: that means the version control system is not going to keep track of any changes unless we tell it to
  • Add file: that tells the version control system to keep track of any future changes to it
  • Commit: that permanently records the fact that we added the file. A commit is a checkpoint
  • Supply a message, note that we now have some history

Changing things

  • Edit the file, change something, save it
  • Note that the file is now marked as modified. (We might also see a backup file ending ~ or .bak from that editor -- ignore it for now, we'll come back to it in a moment)
  • Each revision records the state of all files, not just one file: create another file, omelette.txt, and add it
  • Commit change, note that we now have two revisions
  • Review the history and look at the diff

Managing history

  • The history is not just for information: we can go back to the previous version by updating to it...
  • ... and then a normal update gets us back to the latest version again

Let's say this version is the one that we're going to send off to our agent, to see whether they can sell it to a publisher (or whatever we do in these modern times).

  • Tag the current revision as v0.1 -- digression about sensible tag names on whiteboard?
  • We can now identify this version easily in the history
  • Make and commit another change, this one involving renaming a file
  • What if we make a change and decide we don't want to commit it? Edit something, then hit Revert

Ignoring something we don't want:

  • Go back to that backup file in My Work, add it to the ignored list, commit

Digression: every action we're taking here corresponds to one command-line command: show hg log, hg diff, hg update etc
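As a rough cheat sheet for this digression, the Part 1 actions map onto hg commands approximately as follows. This is a sketch in Python that only prints the commands rather than running them; the commit message, the revision number 0 and the new file name fish-stew.txt are illustrative assumptions, not part of the plan.

```python
import shlex

# EasyMercurial action -> approximate hg command line.
# The commit message, revision number and renamed file are made-up examples.
PART1_COMMANDS = [
    ("Initialise a repository", ["hg", "init"]),
    ("Add file", ["hg", "add", "fishstew.txt"]),
    ("Commit", ["hg", "commit", "-m", "Start ingredients list"]),
    ("Review the history", ["hg", "log"]),
    ("Look at uncommitted changes", ["hg", "diff"]),
    ("Go back to an old version", ["hg", "update", "-r", "0"]),
    ("Return to the latest version", ["hg", "update"]),
    ("Tag", ["hg", "tag", "v0.1"]),
    ("Rename a file", ["hg", "rename", "fishstew.txt", "fish-stew.txt"]),
    ("Revert", ["hg", "revert", "--all"]),
]


def cheat_sheet(commands=PART1_COMMANDS):
    """Format each action and its shell-quoted command on one line."""
    return [f"{action:30} {shlex.join(argv)}" for action, argv in commands]


print("\n".join(cheat_sheet()))
```

Printing rather than executing keeps the sketch safe to run anywhere; participants can then type individual commands in a terminal inside the working copy.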

Now we have history, but we are still in big trouble if our computer fails. Thus...

Part 2: Working by yourself, but "with backups"

Topics: Push, clone, using an online repo hosting service

Pushing local repository to a remote one

  • Register an account on Bitbucket and create a new private repository
  • Look up its URL
  • In EasyMercurial, hit the Push button, enter URL, push to remote repo, check that the history is present and correct on the site

Synchronising new changes

  • Make another change locally, commit (perhaps do this more than once before pushing)
  • Push the change(s), and check the history on site again

Recovering from a disaster

  • Exit EasyMercurial. Delete the local repository / working copy folder completely!
  • Start EasyMercurial again, see that (sniff) the working copy is lost
  • Clone it again from Bitbucket and note that the history is all there

Digression: Note command-line usage again, hg push, hg pull, hg clone: show a sequence of clones, modifying the last one and pushing back along the chain?

Part 3: Introducing other developers

Topics: Conflicts, merges, pull, annotate

Pair up and, in each pair, decide whose Bitbucket repo you will be working on and whose we'll just leave for now. (We should pair the "instructor" with someone as well).

Remark: in real-world use, this could very well be the same person just using two different computers

Getting someone else's changes

  • The second person in the pair should then clone the repository from Bitbucket
  • They should then change something and commit it
  • First person pulls -- notes that there are no changes there (merely committing it doesn't put it on the server)
  • Second person pushes
  • First person pulls again -- the change is now in the history but not in the working copy
  • First person updates -- the change is now in the working copy!

Making alternative versions

  • Both people now make some edits: they should edit two different files and commit them
  • The second user should push their changes to the remote repo first
  • The first user then tries to push. They should get the "Push failed... The local repository may have been changed" message
  • Then the first user pulls instead. (Perhaps checks the Incoming list first?)
  • See that the history graph now shows two heads

Digression on sociological nature of conflict a la Greg if feeling expansive

Merging non-conflicting changes

  • Hit Merge, see that the merge happens straight away
  • Remark that this is the point at which you would now test the merged version
  • Commit, push, get collaborator to pull

Resolving conflicts

  • Get each person to edit the same file, in conflicting ways
  • Again, both users should try to push and the push should fail for the later one
  • That user pulls, hits Merge, gets the merge window up
  • Do an "instructor-guided" merge
  • Commit, push, get collaborator to pull

Annotate ("blame")

  • Run annotate on the recipe file to see who changed what and when

Part 4: More sophisticated business at the command-line

hg archive: packaging from a tag

Having tagged the version (v0.1 or whatever we called it) that we're going to send off to the agent, now we need to pull out only that version and send it off, without the whole repository attached.

  • Open a terminal window
  • cd to the working copy
  • Run hg archive -r v0.1 book-v0.1.zip
  • Check book-v0.1.zip and make sure it contains (only) the correct files for the revision tagged v0.1
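One possible way to do the check without unpacking the archive is to list its contents from Python. This is a minimal sketch using only the standard library; the function name archive_file_names is an assumption made for this illustration.

```python
import zipfile


def archive_file_names(zip_path):
    """Return the sorted paths of the files (not directories) in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            info.filename for info in archive.infolist() if not info.is_dir()
        )


# Example use, from the directory containing the archive:
#     for name in archive_file_names("book-v0.1.zip"):
#         print(name)
```

Expect the files to sit under a top-level folder named after the archive, and expect Mercurial to normally add a small .hg_archival.txt file recording which revision the archive came from. Tracked housekeeping files such as .hgtags may appear as well.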

hg id: provenance when running experiments

  • Run hg id at command line and note that it shows the id of the current parent revision
  • We can do this from a script when running an experiment, or from a Python program
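A minimal sketch of the "from a Python program" case, assuming hg is on the PATH and the program runs inside the working copy. The function name current_revision, the injectable run parameter and the "unknown" fallback are choices made for this illustration, not part of Mercurial.

```python
import subprocess


def current_revision(repo_dir=".", run=subprocess.run):
    """Return the output of `hg id -i` for repo_dir, or "unknown" if hg fails."""
    try:
        result = run(
            ["hg", "id", "-i"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # hg is not installed, or repo_dir is not inside a repository.
        return "unknown"
    return result.stdout.strip()


# Example use: record the revision alongside the experiment's results.
#     print("Code revision:", current_revision())
```

Writing this id at the top of every results file ties each result to the exact code that produced it. A trailing "+" on the id means the working copy had uncommitted changes when the experiment ran, which is worth flagging.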

hg backout: undoing an earlier revision

  • Want to undo something we committed earlier? hg backout will prepare a changeset which is the "inverse" of some earlier commit. That might involve resolving a merge, if something else has changed in the meantime. You can then test it and commit.

hg bisect: finding the origin of bugs you can't see (analogy with hg annotate for finding bugs you can see)

  • Example based on the Python code from elsewhere in the workshop
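hg bisect can run a check automatically on each candidate revision via its --command option, treating exit status 0 as "good" and most other statuses as "bad". The sketch below shows the shape such a check might take. The workshop's Python code is not reproduced in this plan, so scale_quantity is a hypothetical stand-in for whatever function has the bug; in practice the check would import the real function from the project.

```python
def scale_quantity(grams, servings, base_servings=4):
    """Hypothetical function under test: scale an ingredient quantity."""
    return grams * servings / base_servings


def bisect_status():
    """Return 0 if this revision behaves correctly, 1 if the bug is present."""
    try:
        ok = scale_quantity(200, 8) == 400
    except Exception:
        # A revision that crashes counts as bad too.
        ok = False
    return 0 if ok else 1


# When saved as a script (say check.py, a made-up name), finish with:
#     raise SystemExit(bisect_status())
```

A session might then run hg bisect --reset, hg bisect --bad (the current revision is broken), hg bisect --good v0.1 (the tagged revision worked), and finally hg bisect --command "python check.py" to let Mercurial search for the first bad revision.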

Misc notes

Things not yet incorporated into the above: Copying, renaming, deleting files; Branching and merging amongst branches; Stuff that is different in other systems

Should we cover (named) branches and merges between them?

Against: perhaps a level of complication too far for a two-hour intro; Greg doesn't cover them in the Subversion version; lessons learned from Hg are not immediately applicable to git or Subversion because the branching methods are somewhat different.

For: they are much simpler to use in Mercurial than in Subversion!

Quiz

Based on http://software-carpentry.org/4_0/vc/quiz/

  • Why is it a good idea to use version control on your projects?
  • What is a version control repository?
  • Suppose you’ve created a new file on your computer and you want to start using version control for it. How do you go about doing this?
  • Jon is working on a version-controlled project with Ainsley and Tommy. He wakes up early one day, ready to do some work on the project. What is the first thing he should do?
  • Tommy and Jon have up-to-date local repositories, and are both editing a file that contains 10 lines of data. Jon makes a change to the fifth line, and commits and pushes his changes to the remote repository they're both using. Tommy makes a change to the first line of the file, commits his changes, and tries to push them. What will happen, and what should Tommy do next?
  • Ainsley and Jon are up-to-date, and are both editing the sixth line of a file. Jon commits and pushes his changes first. When Ainsley commits and tries to push, what will happen, and what should she do next?
  • How do you undo local changes to files that have not been committed?
  • Give the shell commands you would use to accomplish the following tasks:
    • Check out the repository located at http://example.com/repo into the directory /cygwin/home/repo
    • View the log of changes
    • Add the file “experiment.txt” to the repository
    • Commit the file to the repository.
    • Update the local copy to reflect any new changes in the remote repository

Post-presentation notes

Wrap-up of UCL event (first run for this material)

VersionControl.pptx - PowerPoint introduction (2012-04-26), 59.8 KB (Chris Cannam, 2012-04-26 05:18 PM)

VersionControlT.pptx - More sensible PowerPoint introduction without quite so many words in it, 66.2 KB (Chris Cannam, 2012-05-02 02:34 PM)

VersionControlT.pptx - More sensible PowerPoint introduction without quite so many words in it (2012-05-14), 71 KB (Chris Cannam, 2012-05-14 08:58 AM)

VersionControlT-2.pptx - Minor updates for Feb 2013, 68.2 KB (Chris Cannam, 2013-02-05 04:45 PM)

VersionControlT.pdf - PDF version of Feb 2013, 666 KB (Chris Cannam, 2013-02-05 04:45 PM)