Learning Git the Hard Way

I’m a long time Subversion veteran with very little DVCS experience – I’ve used Mercurial for personal projects for a longer while now, but so far with very little need for the features where DVCS systems are radically different from centralized systems. It’s mostly been hg commit, hg push – using Mercurial rather like a centralized system.

So why suddenly jump to Git?

The project I’m currently working at is hosted in a Subversion repository. As it happens, I needed to try out some rather sweeping changes, and being a Subversion veteran, I really didn’t want to have to struggle through the Subversion merge process if they worked out. But Git has support for working with Subversion, so I thought I’d give it a shot.

A rocky start

My first attempt at cloning the Subversion repository started out like this, with TortoiseGit:

TortoiseGit showing an SSH password prompt

Trouble is, the repository is hosted on an Apache server, not through a SSH tunnel. Hitting cancel canceled the entire operation. So I clicked on OK in the vague hope that it’d ask if I wanted HTTP instead. The result:

TortoiseGit showing a perl.exe crash and the text "success" in bold letters

Great success! No, wait…

So yeah, not exactly stellar. But on occasion I can be annoyingly persistent, so I figured I’ll use the command line version instead. And after perusing the Git manual for a while, I successfully cloned the SVN repository.

Lesson 1: neither Revert nor Reset does what I expected

Coming from Subversion, I was used to the idea that in order to un-modify a modified file, the thing to do is to say “svn revert filename”. I had read enough about Git to know that wasn’t the right command – in fact, the manual on revert says just so:

Note: git revert is used to record some new commits to reverse the effect of some earlier commits (often only a faulty one).

Right! OK. So what about this Reset thing then?

git reset [–<mode>] [<commit>] –hard

Matches the working tree and index to that of the tree being switched to. Any changes to tracked files in the working tree since <commit> are lost.

Being the astute reader that I am, I completely failed to notice the significance of that last sentence there. I googled for some example usages, and for a moment, thought that the thing to do would be to git reset –hard HEAD^.

(Those of you who know Git: I can see you cringe. Please stop doing that.)

See, HEAD^ is not some obscure way of saying “the last committed version”. It’s an obscure way of saying “the version just before the last committed one”.

So yeah, I just removed the last committed version from my timeline.

Lesson 2: Reset still doesn’t do what I expected

Having convinced myself that I just threw my latest bits of work into the bit bucket, I quickly located my last compiled version – I knew it still had the changes I had made. I threw the assembly into Reflector, decompiled it, copied my changes back and then cleaned up the bits Reflector didn’t quite get right in the decompilation. Time spent: a few minutes. Anxiety level: through the roof.

Having this newfound wisdom about the destructiveness of reset, I decided to tweet about it. And in a matter of moments I received this reply:

@rytmis Solid advice, hopefully you didn’t lose too much? git reflog to the rescue.

Who the what now?

So as it turns out, “Git tries very hard to not lose your data for you”. Even when you tell Git to reset the status of your branch to a given commit, it doesn’t yet mean that commit is gone. And true enough, after a hard reset, running “git reflog” still shows that the commit exists. Saying “git reset –hard 5bcde1b” (where 5bcde1b is the identifier for the “lost” commit) undoes the damage.

Of course, by then I was too exhausted to try that route. Smile

Lesson 3: conflict resolution doesn’t work the way I expected

The first time a conflict occurred, I got really confused. Because, you see, I issued a “git svn rebase” expecting it to work like “svn update”. And for a while it worked the way I wanted it to work. But then my first conflict happened.

The key difference with a decentralized system is, of course, that both participants may have made multiple commits. This means that conflict resolution can’t happen quite like it does with centralized systems.

When I do a “git svn rebase”, what happens is roughly that Git remembers where I was when I last rebased. It rewinds my master branch to that state and then applies the new, incoming changes. So far, so good. Now, my changes were based on an earlier commit, so they have to be re-based on the new ones in order for there to be a consistent timeline. So Git begins to apply my recorded commits on top of the new base revision. If I get lucky, nothing special has to be done. If not, it’s conflict resolution time.

And here comes the really confusing part.

I may end up resolving a conflict between the latest revision from the remote and a local revision that’s several commits in my past. That is to say, the conflicting file will not contain my latest changes.

This really freaked me out at first.

With trembling hands I resolved my first conflicts in a way that seemed to make some kind of sense and continued with the rebase. I gave a sigh of relief when I noticed that afterwards, all my stuff was still safe. I repeated this cycle a few times before I began to grok what was going on. Of course the conflict resolution happens in “my past”. Because it has to be done at the rebase point.

Lesson 4: merges don’t work the way I expected

Another Subversion thing I had grown used to was how branches got reintegrated. You’d merge trunk changes into the branch, then merge the branch back.

Doing that with Git was a really bad idea. Especially given how at the time I had no idea how to undo the master –> branch merge I had done.

Remember the rebase process I described? Rebasing the branch on top of the latest master and then merging the branch back was way less painful.

Conclusion: Why bother?

Before this, I had tried to understand Git, and failed miserably. I suspected this would be painful, and indeed it was. And I’ve just begun to scratch the surface of Git. So why would I voluntarily do something like this when I could have just used Subversion?

Well, there’s the whole “get out of your comfort zone” aspect of things. And then there’s the fact that yeah, branch merges really are less painful.

But more to the point, collaborating with systems like this is a game changer. I know, I’m late to the game and it changed already, but damn. Thanks to Git, contributing to Dotless has been my best open source experience so far.

Oh, and by the time it became necessary for me to do my first Mercurial merge, I came prepared. Winking smile