Version Control - Just Do It

If you do any kind of programming that extends beyond one-liners, you need to version control your work...even if you are a lone wolf weekend warrior, like myself. Why? At least three reasons:
  • Safety. Version control backs up your code, plain and simple. Many options of course for this, but consider...
  • Time travel. A backup of your code files is one thing. A backup of your work is another. Take the 3-month-old, 500-line script your boss wants you to run on new data. You try it on the new data, but it borks. You investigate: you have a whole new format to deal with. So you set to work making your script deal with it. You run into a snag and your script is broken. It's 3 o'clock and you're still working on it. Again your boss calls: "Hey, can you rerun last quarter's data again? I need it by COB." If only you could go back in time and get the original version that worked, you'd have it done in 30 minutes. If your work were version controlled, this is exactly what you could do.
  • Personal and professional development. When you learn version control, you take a first step into the world of open source, team software development. If you are truly interested in becoming a better developer, even as a hobbyist or an occasional script writer, this has to be your goal. If you as a manager or a business analyst want to understand how software engineers think and act, this has to be your goal. Beyond learning to code, you want to learn how coders work together to produce value, and this is hard unless you experience it yourself.
Modern version control systems are very sophisticated, and the learning curve can be daunting. Being an individual coder can actually make it easier to break in, since many features that are designed for teamwork (conflict resolution, merging) are not necessary to start with. I use Subversion; we use it at work, and BioPerl was using it when I started my open source adventure. The examples below will use it. Git and Mercurial are extremely popular, and better for many reasons. But again, this is about getting started with something.

The Server.

Administering a version control server also has its complexities and headaches. Fortunately, in the Google era, this doesn't have to be a blocker. If you have a Google account, you can set up a VC project on http://code.google.com. It isn't hard to start a basic Subversion server on your own machine, for example, but a public service has the advantage of being accessible to others once you're ready to interact.
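If you'd rather try the local route first, here is a minimal sketch, assuming the Subversion command-line tools (svnadmin and svn) are installed. The scratch directory and the file:// URL are placeholders of my own, and the three standard directories are created by hand to mimic the layout described for Google Code below.

```shell
# Scratch location for the demo (hypothetical; use any path you like)
WORK=$(mktemp -d)
REPO="$WORK/repo"

if command -v svnadmin >/dev/null 2>&1; then
    # Create an empty repository on the local disk
    svnadmin create "$REPO"
    # Create the conventional top-level directories in a single commit
    svn mkdir -q -m 'standard layout' \
        "file://$REPO/trunk" "file://$REPO/branches" "file://$REPO/tags"
    # List what is now in the repository root
    svn ls "file://$REPO"
else
    echo "Subversion tools not found; install them first" >&2
fi
```

A file:// repository like this is only reachable from your own machine, which is fine for practice but is exactly the limitation a public service removes.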

When signed in to Google, create a new Google Code project. You may select one of Git, Mercurial, or Subversion as your server. If you choose Subversion, Google Code will set up the standard Subversion repository directories 'branches', 'tags', and 'trunk'. From the tabs on the homepage, select 'Source', then 'Checkout' from the tab menu. There you'll see a command for your chosen client (for Subversion, the client is called 'svn') that downloads a working copy of your repository's code from its URL.

(To be clear, I'm not plugging Google, I'm plugging "just get started", and this is as easy a way as I've found. You can export to anywhere and even convert between VC applications later.)

The Client.

You need to set up the VC client on your local machine. Un*x and Windows are both supported by all the players. svn is the Subversion client. (To interact with Google repos, you want a client built with https support.)
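One way to check what you have, sketched under the assumption that svn is already on your PATH: the full `svn --version` output lists the repository access modules the client was built with, and the https check below simply greps that listing for the phrase "handles 'https' scheme". The variable name is my own.

```shell
if command -v svn >/dev/null 2>&1; then
    # --quiet prints only the version number
    SVN_VERSION=$(svn --version --quiet)
    echo "svn client version: $SVN_VERSION"
    # The full version output lists which URL schemes each module handles
    if svn --version | grep -q "handles 'https' scheme"; then
        echo "https support: yes"
    else
        echo "https support: not found in this build"
    fi
else
    echo "svn not found on PATH" >&2
fi
```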

The Drill.

  • Import your code into the server.
  • Go to the directory where your code is and do the following:
    $ cd myscripts/500line
    $ svn import -m 'my import' \
       . https://notworkrelated.googlecode.com/svn/trunk \
       --username you@gmail.com
    
    The '-m' option supplies a comment (the log message) that will be added to your commit log. In general, every change you commit will require a comment. This is a Good Thing.
  • Check out a working version of your code.
  • In Subversion, importing your existing code from a directory doesn't bring that directory under version control. You need to formally check out your code into a new place.
    $ cd ..
    $ svn checkout https://notworkrelated.googlecode.com/svn/trunk \
       notworkrelated
    
    Your code is now checked out into the 'working directory' myscripts/notworkrelated.
  • Work on your code.
  • Go to your working directory, and work. Get rid of the original directory: move it, tar it up and delete it, whatever.
    Changes to any files in your original import will now be monitored by version control. If you need to add new files, you need to tell VC about them. In Subversion, use the 'add' command:
    $ touch empty.file
    $ svn add empty.file
    
    Add a whole directory structure, same command:
    $ mv ../otherstuff/otherscripts .
    $ svn add otherscripts
    
    If you need to delete a file or directory, use the 'delete' command (also spelled 'del' or 'rm').
    $ svn rm old.pl~
    
    This deletes the file from your working directory and schedules it for deletion in VC; that is, once you commit, the file will no longer appear in the latest version of your code at checkout. You can still get it back from an earlier revision if you need it later.

    You do need to inform VC when you make other changes to your directory structure:
    $ svn mv --parents otherscripts aux/otherscripts
    
    Here, with '--parents', svn mv creates the new parent directory 'aux' and the leaf 'otherscripts'. In your working directory, you have moved 'otherscripts' and its contents to 'aux/otherscripts'. You have also scheduled './otherscripts' and its contents for deletion in the repos, and 'aux' and all of its new contents for addition. 'Scheduled' again means these events will happen the next time you...
  • Commit your code.
  • Send all your changes to the repository by issuing
    $ svn commit -m 'You must add a comment...get used to it'
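Before committing, it can help to see what is actually scheduled. Here is a minimal sketch using `svn status`, run against a throwaway local repository so it doesn't touch your real project. It assumes svnadmin and svn are installed; the paths and variable names are hypothetical.

```shell
# Throwaway repository and working copy (hypothetical paths)
WORK=$(mktemp -d)

if command -v svnadmin >/dev/null 2>&1; then
    svnadmin create "$WORK/repo"
    svn checkout -q "file://$WORK/repo" "$WORK/wc"
    cd "$WORK/wc"

    touch empty.file
    svn add -q empty.file

    # Status before the commit: the new file is listed as scheduled for addition
    BEFORE=$(svn status)
    echo "$BEFORE"

    svn commit -q -m 'add empty.file'

    # Status after the commit: nothing pending, so nothing is printed
    AFTER=$(svn status)
    echo "$AFTER"
else
    echo "Subversion tools not found; install them first" >&2
fi
```

The first status call should print a line starting with 'A' for the added file; deletions show up the same way with a 'D'.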
    
Now, work, commit, work, commit, ad infinitum. Commit early and often. Every so often, see how much work you've done:
$ svn log
------------------------------------------------------------------------
r226 | maj | 2013-08-30 23:28:10 -0400 (Fri, 30 Aug 2013) | 1 line

working on exception formatting
------------------------------------------------------------------------
r225 | maj | 2013-08-27 22:39:19 -0400 (Tue, 27 Aug 2013) | 1 line

propsets
------------------------------------------------------------------------
r224 | maj | 2013-08-27 22:21:06 -0400 (Tue, 27 Aug 2013) | 1 line

added spatyper lib and t; propset on everything
------------------------------------------------------------------------
r223 | maj | 2013-08-25 22:47:36 -0400 (Sun, 25 Aug 2013) | 1 line

adding Mojo tests
------------------------------------------------------------------------
r222 | maj | 2013-08-23 22:36:13 -0400 (Fri, 23 Aug 2013) | 1 line

inspect template is working
------------------------------------------------------------------------

It's a good feeling.

Version control is a discipline; you need to learn it, get comfortable with it, and then always use it. This post is about getting there from zero. Of course, VC becomes useful when you need to recover old work, work on different versions of the same code (the old version that works, say, and the version with new features that is currently broken), or when you want to work together with others. For these items, see the many excellent, in-depth tutorials out there. For Subversion, I like the good ol' Red Bean book. For Git, try Understanding Git Conceptually.
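To give a taste of the "recover old work" case, here is a minimal sketch of pulling an earlier revision of one file with `svn cat -r`, without disturbing the half-finished version in your working directory. It runs against a throwaway local repository and assumes svnadmin and svn are installed. The script name report.sh and its contents are made up for illustration.

```shell
# Throwaway repository and working copy (hypothetical paths)
WORK=$(mktemp -d)

if command -v svnadmin >/dev/null 2>&1; then
    svnadmin create "$WORK/repo"
    svn checkout -q "file://$WORK/repo" "$WORK/wc"
    cd "$WORK/wc"

    # Revision 1: the version that worked last quarter
    echo 'echo "old format"' > report.sh
    svn add -q report.sh
    svn commit -q -m 'working version'

    # Revision 2: work in progress for the new data format
    echo 'echo "new format, still broken"' > report.sh
    svn commit -q -m 'work in progress'

    # Write revision 1 to a separate, unversioned file and run it
    svn cat -r 1 report.sh > report_r1.sh
    OLD=$(sh report_r1.sh)
    echo "$OLD"
else
    echo "Subversion tools not found; install them first" >&2
fi
```

Your current report.sh stays at the latest revision the whole time. The tutorials mentioned above cover other ways to go back, such as updating the whole working copy to an earlier revision.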

About Mark Jensen

Getting there from here. CPANID: MAJENSEN