Learning Git the Hard Way

I’m a long-time Subversion veteran with very little DVCS experience. I’ve used Mercurial for personal projects for quite a while now, but so far with little need for the features where DVCS systems are radically different from centralized ones. It’s mostly been hg commit, hg push – using Mercurial rather like a centralized system.

So why suddenly jump to Git?

The project I’m currently working on is hosted in a Subversion repository. As it happens, I needed to try out some rather sweeping changes, and being a Subversion veteran, I really didn’t want to struggle through the Subversion merge process if they worked out. But Git has support for working with Subversion, so I thought I’d give it a shot.

A rocky start

My first attempt at cloning the Subversion repository started out like this, with TortoiseGit:

TortoiseGit showing an SSH password prompt

Trouble is, the repository is hosted on an Apache server, not accessed through an SSH tunnel. Hitting Cancel canceled the entire operation. So I clicked OK in the vague hope that it’d ask if I wanted HTTP instead. The result:

TortoiseGit showing a perl.exe crash and the text "success" in bold letters

Great success! No, wait…

So yeah, not exactly stellar. But on occasion I can be annoyingly persistent, so I figured I’d use the command-line version instead. And after perusing the Git manual for a while, I successfully cloned the SVN repository.

Lesson 1: neither Revert nor Reset does what I expected

Coming from Subversion, I was used to the idea that in order to un-modify a modified file, the thing to do is to say “svn revert filename”. I had read enough about Git to know that wasn’t the right command – in fact, the manual on revert says just so:

Note: git revert is used to record some new commits to reverse the effect of some earlier commits (often only a faulty one).

Right! OK. So what about this Reset thing then?

git reset [--<mode>] [<commit>]

--hard
Matches the working tree and index to that of the tree being switched to. Any changes to tracked files in the working tree since <commit> are lost.

Being the astute reader that I am, I completely failed to notice the significance of that last sentence there. I googled for some example usages, and for a moment, thought that the thing to do would be to git reset --hard HEAD^.

(Those of you who know Git: I can see you cringe. Please stop doing that.)

See, HEAD^ is not some obscure way of saying “the last committed version”. It’s an obscure way of saying “the version just before the last committed one”.

So yeah, I just removed the last committed version from my timeline.
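
(For the record, what I was actually after – throwing away uncommitted changes – would have looked more like one of these. Foo.cs is just a stand-in for the file in question.)

git checkout -- Foo.cs    # discard local changes to a single file, roughly the "svn revert" I wanted
git reset --hard HEAD     # or: discard all uncommitted changes, keeping the latest commit intact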

Lesson 2: Reset still doesn’t do what I expected

Having convinced myself that I just threw my latest bits of work into the bit bucket, I quickly located my last compiled version – I knew it still had the changes I had made. I threw the assembly into Reflector, decompiled it, copied my changes back and then cleaned up the bits Reflector didn’t quite get right in the decompilation. Time spent: a few minutes. Anxiety level: through the roof.

Having this newfound wisdom about the destructiveness of reset, I decided to tweet about it. And in a matter of moments I received this reply:

@rytmis Solid advice, hopefully you didn’t lose too much? git reflog to the rescue.

Who the what now?

So as it turns out, “Git tries very hard to not lose your data for you”. Even when you tell Git to reset the status of your branch to a given commit, it doesn’t yet mean that commit is gone. And true enough, after a hard reset, running “git reflog” still shows that the commit exists. Saying “git reset --hard 5bcde1b” (where 5bcde1b is the identifier for the “lost” commit) undoes the damage.
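
In other words, the recovery looks roughly like this (the commit id is the one from my reflog; yours will differ):

git reflog                # lists every commit HEAD has recently pointed at, including the "lost" one
git reset --hard 5bcde1b  # move the branch right back to that commit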

Of course, by then I was too exhausted to try that route. :)

Lesson 3: conflict resolution doesn’t work the way I expected

The first time a conflict occurred, I got really confused. Because, you see, I issued a “git svn rebase” expecting it to work like “svn update”. And for a while it worked the way I wanted it to work. But then my first conflict happened.

The key difference with a decentralized system is, of course, that both participants may have made multiple commits. This means that conflict resolution can’t happen quite like it does with centralized systems.

When I do a “git svn rebase”, what happens is roughly this: Git remembers which upstream revision my work was based on, rewinds my master branch to that state, and then applies the new, incoming changes from Subversion. So far, so good. Now, my own commits were based on an earlier revision, so they have to be re-based on top of the new ones for the timeline to stay consistent. So Git begins to apply my recorded commits on top of the new base revision. If I get lucky, nothing special has to be done. If not, it’s conflict resolution time.

And here comes the really confusing part.

I may end up resolving a conflict between the latest revision from the remote and a local revision that’s several commits in my past. That is to say, the conflicting file will not contain my latest changes.

This really freaked me out at first.

With trembling hands I resolved my first conflicts in a way that seemed to make some kind of sense and continued with the rebase. I gave a sigh of relief when I noticed that afterwards, all my stuff was still safe. I repeated this cycle a few times before I began to grok what was going on. Of course the conflict resolution happens in “my past”. Because it has to be done at the rebase point.
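
The cycle, roughly (Foo.cs stands in for whatever file happens to conflict):

git svn rebase            # fetch new SVN revisions and replay my commits on top of them
# CONFLICT (content): merge conflict in Foo.cs
# ... edit Foo.cs and resolve the conflict as it looked at that point in history ...
git add Foo.cs            # mark the conflict as resolved
git rebase --continue     # carry on replaying the remaining commits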

Lesson 4: merges don’t work the way I expected

Another Subversion thing I had grown used to was how branches got reintegrated. You’d merge trunk changes into the branch, then merge the branch back.

Doing that with Git was a really bad idea. Especially given that at the time I had no idea how to undo the master -> branch merge I had done.

Remember the rebase process I described? Rebasing the branch on top of the latest master and then merging the branch back was way less painful.
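
As a sketch (with a hypothetical branch called feature):

git checkout feature
git rebase master         # replay the branch commits on top of the latest master
git checkout master
git merge feature         # the merge is now trivial, often a fast-forward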

Conclusion: Why bother?

Before this, I had tried to understand Git, and failed miserably. I suspected this would be painful, and indeed it was. And I’ve just begun to scratch the surface of Git. So why would I voluntarily do something like this when I could have just used Subversion?

Well, there’s the whole “get out of your comfort zone” aspect of things. And then there’s the fact that yeah, branch merges really are less painful.

But more to the point, collaborating with systems like this is a game changer. I know, I’m late to the game and it changed already, but damn. Thanks to Git, contributing to Dotless has been my best open source experience so far.

Oh, and by the time it became necessary for me to do my first Mercurial merge, I came prepared. ;)

Testing instances of anonymous types using the ‘dynamic’ keyword

Recently I’ve been writing a lot of tests that exercise ASP.NET MVC controllers. Sometimes those controllers return JSON data, and the natural way to express that is with anonymous types – the syntax and structure match JSON very well. However, if I suddenly wish to assert on those objects, things get a bit tricky: there’s no statically typed way to access the properties.

JsonResult has a property called Data, which is typed as Object. I figured that if I cast it to dynamic and then used runtime binding, I’d be set. So I wrote a bit of test code:

public void Returns_error_when_list_is_not_found() {
    var controller = new HomeController();
    var result = (JsonResult) controller.AddItemToList("item");
    dynamic resultData = result.Data;
    Assert.AreEqual("Error", resultData.Status);
}

and follow up with a bit of implementation code:

public ActionResult AddItemToList(string item) {
    return new JsonResult { Data = new { Status = "Fail" } };
}

(Note: the value of Status in the implementation code is intentionally different from the one I’m asserting against in the test – we want a red light first!)

Seems simple enough, right? So I hit “Run test” and was rather baffled: instead of seeing an assertion error I saw this:

Test result showing unexpected exception: Microsoft.CSharp.RuntimeBinder.RuntimeBinderException : 'object' does not contain a definition for 'Status'

OK, I thought, maybe I’m just looking at the wrong thing. I fired up the same test in the debugger and checked the contents of resultData. It looked like this:

Debugger clearly shows that the instance has a property called Status

So, sure enough, the object actually was an instance of my anonymous type. What was up with the exception, then?

It turns out that anonymous types are always internal. Which makes sense, because there’s no sane way to represent the type at assembly or even method boundaries. However, since dynamic came along, there is a straightforward way to manipulate the objects beyond those boundaries if you just ship them across as plain Objects.

There are, of course, a couple of obvious solutions: one is to make the bits I want to manipulate statically typed. Another is to futz around with reflection, but I try to keep that to a minimum. The one I chose for now is to mark the assembly under test with

[assembly: InternalsVisibleTo("TestProject")]

… which does away with the problem, and now we get the expected error:

Test result showing the expected error
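
For completeness, here’s roughly what that looks like in context – the attribute typically goes in the test target’s Properties/AssemblyInfo.cs (or any other source file in the assembly under test), and “TestProject” needs to match the actual name of your test assembly:

// In the assembly under test, e.g. Properties/AssemblyInfo.cs.
// "TestProject" must match the test assembly's name (and include its public key if the assembly is signed).
using System.Runtime.CompilerServices;

[assembly: InternalsVisibleTo("TestProject")]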

Another battle won, another C# compiler factoid learned.

Tests reported twice when using XUnit with TeamCity

Here’s a quickie for all you people running XUnit tests in a TeamCity build. TeamCity doesn’t directly support XUnit, so it takes a wee bit of effort to get things going. The way I decided to tackle the issue was to add the following bits to our test project build file:

<Target Name="CustomBuild" DependsOnTargets="Build">
  <CallTarget Targets="Test" Condition="'$(TEAMCITY_PROJECT_NAME)' != ''" />
</Target>

<UsingTask AssemblyFile="..\packages\xunit.1.6.1\Tools\xunit.runner.msbuild.dll"
           TaskName="Xunit.Runner.MSBuild.xunit" />

<Target Name="Test">
  <xunit Assembly="$(TargetPath)" NUnitXml="..\..\TestReport.xml" />
</Target>

The CustomBuild target depends on Build, so Build gets run in any case. This ensures that we always run the latest set of tests.

Then, if the build script detects TeamCity (by presence of the TEAMCITY_PROJECT_NAME variable), it runs the Test target, which outputs its results to TestReport.xml.

Having got this far, I added TestReport.xml to our TeamCity configuration, and things seemed to work nicely. Except that our tests got reported twice.

It took me a while to finally clue in to what was happening: TeamCity was already parsing the output of the XUnit test task, and having a separate test report was what caused the duplicates. This wasn’t immediately obvious to me, until we built a separate performance test bench and used console output to communicate its results to TeamCity (more on that in a future installment).

Long story short: TeamCity can already understand XUnit tests, it just doesn’t provide a way to run them.

ODP.NET application crash with DateTime parameters

I spent some time yesterday debugging a crashing console application – a proof of concept for a wrapper library I’m writing around an Oracle database. I had successfully read values from the DB – writing that code took all of five minutes – and then tried a MERGE statement.

(Side note: this is the first time I’ve written a MERGE statement. In fact, while I did know of MySQL’s “REPLACE INTO”, I had no idea that a proper MERGE statement existed.)

I played around with the statement in TOAD for a while, to make sure it was correctly written, typed it into my editor, added parameter placeholders, added the arguments for the parameters and ran the application. It flashed on the screen and then disappeared.

Which was a bit weird, seeing as it ended with a Console.ReadLine() call. I re-ran the app just to make sure I didn’t accidentally press anything, and the same thing happened. Meanwhile, looking at the results on the TOAD side of things, the value was definitely updated.

I was a bit tired, so it took me a while to notice that the process exit code was 0x80000003 – nonzero, so some sort of error was being reported. Still, even asking the debugger to break on all Win32 errors yielded nothing. And between the statement being executed and the crash, ProcMon reported that all operations the application performed were successful.

I was too tired to make sense of it, so I went home and came back with a fresh set of eyes today. I narrowed the issue down to a single parameter, at which point I began to get NullReferenceExceptions instead of crashes. And the stack traces contained references to the special handling of DateTime parameters.

I was doing this:

command.Parameters.Add(new OracleParameter("curdate", DateTime.Now));

Figuring that DateTimes were a special case, I chose this constructor overload instead:

command.Parameters.Add(new OracleParameter("curdate", OracleDbType.Date, DateTime.Now, ParameterDirection.Input));

After which I re-ran the app, and like magic, everything worked.

Now, realizing that Oracle has more than one way of representing the data contained in a System.DateTime, I understand how there might be a code path or two dedicated to the handling of DateTimes. What I don’t get is how the most common overload of the OracleParameter constructor can result not in an exception, but in a full-on application crash.

Anyway, I hope this post saves someone else the trouble. :)

Fixing broken tools

What sucks about open source is that when something breaks, there’s no guarantee that it will get fixed. What seriously rocks, though, is that you can do it yourself.

As an example of the latter, we’re using dotless for a project at work. I didn’t do much with it, apart from setting up the HttpHandler and letting our frontend developers loose on it. After a few days, though, they ran into an issue: editing an imported less file would not cause the root file to be evicted from the cache.

I figured this was an omission in the caching mechanism, and downloaded the sources from GitHub. When I finally got everything right so that debugging was an option, something weird happened – the path parsing seemed to break, and our includes no longer worked the way they used to.

I dove into the issue, thinking that I’d have to fix it in order to be able to debug the cache issue. When I finally understood enough of the codebase to write the correct tests and make them pass, I suddenly noticed that cache invalidation worked again.

Weird.

Except, not so much when I finally figured it out. The path parsing bug led to imports like “/foo/bar” resolving to “foo/bar”. That in itself was bad enough, mind you, but since the cache code passed the same paths to the CacheDependency constructor, the net result was a cache dependency to a nonexistent file!

The docs for CacheDependency state the following:

“If the directory or file specified in the filename parameter is not found in the file system, it will be treated as missing. If the directory or file is missing when the object with the dependency is added to the Cache, the cached object will be removed from the Cache when the directory or file is created.”

That’s great, but in this particular case, I never want to watch for a file that might get created, so I would have been better served with an error. :)

Anyway, after this little excursion into yak shaving, it made sense that fixing path resolution also fixed the cache.

I contributed my changes as a fork on GitHub in the hopes that it will benefit others as well. Also, it would be nice if my changes were merged upstream, so I wouldn’t have to maintain my own fork. :P

(I’m fairly sure I botched the Git operations somehow, though – I get that Git is awesome, but boy does it come with a learning curve.)

Anyhow, the important thing is that now our frontend developers don’t have to work with broken tools.

Including indirect, optional dependencies in builds

So I’m writing an application that uses NHibernate as its data access layer. I’m also writing a bunch of integration tasks, and a console driver for them. I’ve written the tests, and I’m ready to fire up the real thing from the console.

The first thing that happens is that my application dies. “Could not load file or assembly NHibernate.ByteCode.Castle”.

Well, bugger.

The weirdness starts when I look at my project’s dependencies. It doesn’t depend directly on NHibernate either, but the assembly gets copied to the output folder. So do most of NHibernate’s dependencies. But not this one. Why is that?

NHibernate comes with a set of byte code providers. While this may seem like overkill, I get that people may already be using such a library, and forcing consumers to include yet another one is a no-no. So the dependency on a byte code provider is not a hard dependency. This means that in order to include it in our Visual Studio builds, we need to perform some trickery.

Fortunately, the trickery is rather simple. First, reference the chosen byte code provider in the project that has the NHibernate reference. At this point you may feel compelled to already test the results – don’t bother, they’ll be equally disappointing. Since the reference isn’t actually used anywhere, the build process will skip it.

Instead, what you need to do is add a compile-time reference – that is, directly use a class included in the assembly. Here’s what I did:

    #pragma warning disable 169
    // This reference ensures that we always get NHibernate.ByteCode.Castle.dll
    // as an indirect reference in other projects.
    private static readonly Type proxyFactory = typeof (NHibernate.ByteCode.Castle.ProxyFactory);
    #pragma warning restore 169

(The pragma does away with the compiler warning about an unused field.)

Of course, you can always accomplish the same thing with build scripts, but this felt like a workable compromise. As a bonus, I don’t have to worry about this every time I reference this project.
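
If the build-script route appeals to you more, a rough sketch could look something like this – note that the source path here is purely hypothetical and depends on where the byte code provider assembly actually lives in your tree:

<!-- Hypothetical alternative: copy the byte code provider to the output folder after every build. -->
<Target Name="CopyByteCodeProvider" AfterTargets="Build">
  <Copy SourceFiles="..\lib\NHibernate.ByteCode.Castle.dll"
        DestinationFolder="$(OutputPath)" />
</Target>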

My Biggest Career Move Yet

What a summer it has been. I took one of the most interesting vacation trips I’ve ever had – a three-week InterRail romp through Europe. We met interesting people, covered approximately 4 000 km by train and drank a lot of beer. I don’t think I’ve gone that long without really thinking about work since I got started.

Getting really relaxed and out of work-mode was a really good idea. You see, before we left, I had made some plans, and I’ve been acting on the plans since I came back.

I’m switching jobs again. But this time both the reasons and the new job are a bit different: I’m setting up shop with Sami Poimala, Jouni Heikniemi and Riikka Heikniemi. My very own company, something I’ve been dreaming about for years now.

Exciting times!

 

So what’s the plan?

Well, since you asked, the long-term plan looks a lot like this:

Pinky and The Brain

 

 

However, that’s going to take some time to achieve, so the near-term plan looks more like this:

Will code HTML for food

 

In all seriousness, though, when you put the right people together, good things tend to happen. And I can’t think of a more fitting group of people to accompany me.

Chad Fowler says that when you play music, you should always strive to surround yourself with players that are better than you.

I think I just did. :)

I’d like to thank all my soon-to-be-ex-coworkers at Sininen Meteoriitti. It’s been a brief, but fun ride together, and I won’t forget any of you any time soon. If you miss me terribly, you can always have someone make a pot of bad coffee and listen to some Eduard Khil!

 

Stay tuned for the next installment.

Post Haste

Meh, so I screwed up with Live Writer, and instead of using an earlier post as a template, I overwrote the old post. Fortunately Google Cache found the old post, but I had to do some updating to fix links that didn’t quite work, and it’s possible that RSS subscribers have been seeing random weirdness. 🙁

PowerShell quickie: expanding the properties of a variable inside a string

Here’s something that gave me a bit of trouble: I was attempting to run a script to upgrade a bunch of SharePoint solutions, and I wrote it like this:

dir -Recurse -Filter *.wsp | %{ stsadm -o upgradesolution -name $_ -filename $_.FullName -allowgacdeployment -local }

Now, that actually works, but since I hadn’t tested it before, I figured I’d echo the command out first, so I tried this:

dir -Recurse -Filter *.wsp | %{ echo "stsadm -o upgradesolution -name $_ -filename $_.FullName -allowgacdeployment -local" }

Which is all well and good, except it echoes out something like the following:

stsadm -o upgradesolution -name Foo.wsp -filename Foo.wsp.FullName -allowgacdeployment -local

The reason this happens is that the string expansion in PowerShell has no way of knowing that I meant to expand the property instead of printing out the literal characters. Having figured this out, the next thing I tried was:

echo "${_.FullPath}"

Which doesn’t work either. Crap. I’m not familiar enough with the PowerShell syntax to explain why that is, but after a moment of pondering I finally figured out something that does work – expressions:

dir -Recurse -Filter *.wsp | %{ echo "stsadm -o upgradesolution -name $_ -filename $($_.FullName) -allowgacdeployment -local" }

The $() syntax allows us to evaluate any arbitrary expression, which is then inserted into the string. Yay!
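
To make the difference concrete, here’s a minimal sketch (Foo.wsp is a hypothetical file). As far as I can tell, ${name} merely delimits a variable name – handy when the name contains odd characters – but it doesn’t evaluate member access, whereas $() evaluates a whole subexpression:

$file = Get-Item .\Foo.wsp
echo "plain:      $file.FullName"     # expands $file, then appends the literal text ".FullName"
echo "expression: $($file.FullName)"  # evaluates the member access first, then inserts the result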
