One Thing I Love About Git

If you've been using Subversion or (shudder) CVS, you only have the briefest glimmerings of what source control is about. I don't really like having to dig too deeply into tools that I use. I want them to be easy, though I will dig when something's hard.

One thing which frustrated me about Subversion is the fact that, as mentioned, I don't think about some things. More than once I've quickly hacked up a change to a module, switched other modules to use that module and done a quick svn rm and svn add.

Oops. I just lost my version history. Damn it.

Not with git. It figures it out for me. My Veure project uses DBIx::Class::Schema::Loader because I don't want to think about building my schema classes. The 0.5003 version is fantastic. It does a better job of naming relationships and the DBIx::Class::Schema::Loader::DBI::Pg support is fantastic.

But what happens if you rename a table? I've done this more than once on my new project, continually working to ensure that I keep things clean and consistent. Unfortunately, renaming the foo_bar table to foo means that a different schema class is created, separate from the old one. Naturally, I don't think about this and I do a quick git rm and git add.

And git sees what I did and it realizes that the file has been renamed and I don't lose my history.

From what I read (I could be wrong), git actually uses heuristics to determine whether or not a file has been renamed, but so far it's worked flawlessly for me. I am still using Subversion at work (I've not felt comfortable checking out the git-svn work, but that's me being silly), and it's just painful. I doubt I'll ever use Subversion for a personal project again.

Update: came into work this morning and was asked to merge one branch into another. It was conflict hell and I was having trouble figuring out all of the changes. A colleague, using git-svn, merged the two branches cleanly, no problem. I think it's time for me to learn git-svn now.

12 Comments

Just wanted to say, I find almost every one of your posts fascinating and educational. Keep up the great work.

Yes, git uses an heuristic to detect renames.

It compares the files and calculates the percentage of common lines.

If the percentage is above X, then this is a rename. You can see the similarity between two files in an extended diff header line called similarity index. Try a git diff -M between two revisions where you renamed some files and look for it.

I don't know the value of X. I looked for it in the docs and I cannot find it. I know that X is not 100%. For example, if you rename a Perl package, even with the changed package Name line, git will still detect the rename.
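To make the idea concrete, here is a minimal Python sketch of a "percentage of common lines" check. It is only an illustration of the description above, not git's actual algorithm. The function names and the sample Perl sources are made up for the example, and the 50% default threshold is an assumption taken from what the current git diff documentation gives as the default for -M.

```python
from collections import Counter


def similarity(old_text: str, new_text: str) -> float:
    """Return the percentage of lines the two texts share (0 to 100)."""
    old_lines = Counter(old_text.splitlines())
    new_lines = Counter(new_text.splitlines())
    # Multiset intersection: a line counts as shared once per matching copy.
    common = sum((old_lines & new_lines).values())
    longest = max(sum(old_lines.values()), sum(new_lines.values()))
    if longest == 0:
        return 100.0  # two empty files are treated as identical
    return 100.0 * common / longest


def looks_like_rename(old_text: str, new_text: str, threshold: float = 50.0) -> bool:
    """Treat a delete/add pair as a rename when similarity reaches the threshold."""
    return similarity(old_text, new_text) >= threshold


# Hypothetical Perl module before and after its package line is changed.
OLD = "package Foo::Bar;\nuse strict;\nuse warnings;\nsub new { bless {}, shift }\n1;\n"
NEW = "package Foo;\nuse strict;\nuse warnings;\nsub new { bless {}, shift }\n1;\n"

if __name__ == "__main__":
    print(f"similarity: {similarity(OLD, NEW):.0f}%")
    print("rename detected:", looks_like_rename(OLD, NEW))
```

Only the package line differs between the two samples, so four of the five lines are shared and the pair still counts as a rename under this sketch.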

BTW, why git works this way is explained in a very old message in the git mailing list archive. It is said [1] that it's the most important message in the git history:

http://article.gmane.org/gmane.comp.version-control.git/217

As for git-svn, it's a must :), just make sure you get decent Perl svn bindings.

It's important to realise that, unlike svn, git doesn't even store renames in the repository. It doesn't version files, it versions whole trees, and any 'find me the last revision to change this file' logic is heuristic, based on similarity. This means, apart from anything else, that git is quite capable of coping with 'rename a file, make a copy, change both in different ways' in a single revision, which svn certainly isn't (when I was still using svk I managed to break the repo several times by forgetting to commit a rename before a change, but I think that may have been an svk rather than an svn bug).
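Here is a rough Python sketch of what "versioning whole trees" implies for renames. Each snapshot is just a mapping of path to content, nothing about renames is recorded, and the origin of each new file is guessed afterwards from similarity. The names guess_origins, OLD_TREE and NEW_TREE, the sample file contents and the 0.5 threshold are all assumptions for illustration. Real git is considerably more sophisticated about how it pairs files.

```python
import difflib


def similarity(old_text: str, new_text: str) -> float:
    """Score two texts from 0.0 to 1.0 by comparing them line by line."""
    matcher = difflib.SequenceMatcher(None, old_text.splitlines(), new_text.splitlines())
    return matcher.ratio()


def guess_origins(old_tree: dict, new_tree: dict, threshold: float = 0.5) -> dict:
    """Map each added path to the deleted path it most resembles.

    Neither snapshot stores rename information; the mapping is worked
    out after the fact, purely from the file contents.
    """
    deleted = [path for path in old_tree if path not in new_tree]
    added = [path for path in new_tree if path not in old_tree]
    origins = {}
    for new_path in added:
        best_path, best_score = None, threshold
        for old_path in deleted:
            score = similarity(old_tree[old_path], new_tree[new_path])
            if score >= best_score:
                best_path, best_score = old_path, score
        if best_path is not None:
            origins[new_path] = best_path
    return origins


# Hypothetical snapshots: one file is renamed, copied and edited in one revision.
OLD_TREE = {
    "lib/FooBar.pm": "package FooBar;\nuse strict;\nuse warnings;\nsub load { 1 }\nsub save { 1 }\n1;\n",
    "README": "docs\n",
}
NEW_TREE = {
    "lib/Foo.pm": "package Foo;\nuse strict;\nuse warnings;\nsub load { 1 }\nsub save { 1 }\n1;\n",
    "lib/Foo/Copy.pm": "package Foo::Copy;\nuse strict;\nuse warnings;\nsub load { 1 }\nsub save { 2 }\n1;\n",
    "README": "docs\n",
    "Changes": "first release\n",
}

if __name__ == "__main__":
    for new_path, old_path in sorted(guess_origins(OLD_TREE, NEW_TREE).items()):
        print(f"{old_path} -> {new_path}")
```

Both new modules are traced back to lib/FooBar.pm, while Changes is treated as a brand-new file because it resembles nothing that was deleted.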

Although one should be careful about that. I still prefer not to make sweeping changes to a file in the same commit as a rename. Small edits like changing package lines are OK, but for very much more than that I prefer to rename and edit in separate commits, to avoid straining the heuristic.

This is kind of belated, but what the hell.

I always like posts on Git vs. others, because I don't have the experience with various SCMs that other people have, and I can learn about them from their experience, so I appreciate that.

Regarding git-svn, that's one thing I can contribute to. A colleague and I have been using it at $work with our rather-big-svn-repo, and it's pure heaven. Basically the difference for the major commands is:

Instead of "svn up", you do "git svn rebase". You cannot "git svn rebase" if you have uncommitted changes waiting.

Instead of "svn commit", you do all your commits regularly with "git commit" and when you're done you run "git svn dcommit". That just uploads your commits to the subversion repo (the way "svn commit" does).

Basic work cycle is:

git svn clone --username user svn://.../
(you will be prompted for the password unless it's saved by the subversion client)
(work work work)
git commit, git commit, git commit
git svn rebase
git svn dcommit

It's pretty easy once you just try it.

I'm really happy to hear it :)

@Aristotle: Oh, I agree. I'm talking more about the small changes needed to make something compile, or about splitting a file in two without losing history for either copy.

Please change your background so that it's possible to read the blog after the second paragraph and the comments. You'll get a lot more responses!


About Ovid

Have Perl; Will Travel. Freelance Perl/Testing/Agile consultant. Photo by http://www.circle23.com/. Warning: that site is not safe for work. The photographer is a good friend of mine, though, and it's appropriate to credit his work.