ExecutionEngineException in ASP.NET MVC when mixing .NET Framework and .NET Standard assemblies

TL;DR: If you mix .NET Standard assemblies that depend on the System.Net.Http NuGet package into your .NET Framework ASP.NET MVC app, you are likely to encounter runtime crashes whenever your AppDomain unloads and reloads. The fix is a binding redirect plus a project-level build setting.

This one was fun to figure out. I’m working on a project where I really, really wanted to use Microsoft’s SpaServices package to enable server-side rendering for a Node frontend app. Fortunately, it only takes a little bit of glue to make the two stick together in an ASP.NET MVC 5 app, and you’re off to the races.

Unfortunately, it also comes with a nasty side effect of periodically crashing your app during development.

It took me a while to recognize the pattern, but after a dozen or so crashes, I finally figured out that the exception always occurred after I had either edited the Web.config file or rebuilt the binaries — that is, whenever the AppDomain was unloaded and the app restarted.

The problem was initially triggered by ASP.NET Web API’s EnsureInitialized call, but the culprit at the top of the stack was actually RuntimeAssembly.GetTypes() — so when I managed to trick Web API into not triggering the problem, it moved on to something Glimpse did. When I removed Glimpse, it moved to something the ASP.NET MVC infrastructure did, and so on.

The actual problem is a combination of different issues: when running on the desktop framework, the System.Net.Http version you’re supposed to use is the one shipped in the framework. However, that version has some weird versioning quirks because it isn’t just a .NET assembly but also a Windows component. This leaves it with a version number that is lower than its NuGet counterpart’s, which, in some edge cases, leads to the wrong assembly ending up in your process. I’m not sure where the memory corruption comes into play (and I’m not sure I want to know 😛), but fortunately the fix is simple.

You need to add a binding redirect for System.Net.Http and set ImplicitlyExpandNETStandardFacades to false in your project, as described in this GitHub comment.
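In practice, the combination looks roughly like this; the exact version numbers depend on which System.Net.Http package you reference, so treat this as a sketch and double-check against the linked comment. First, the binding redirect in Web.config, which pins resolution to the in-box framework assembly:

    <dependentAssembly>
      <assemblyIdentity name="System.Net.Http" publicKeyToken="b03f5f7f11d50a3a" culture="neutral" />
      <bindingRedirect oldVersion="0.0.0.0-4.2.0.0" newVersion="4.0.0.0" />
    </dependentAssembly>

Then, the build setting in the .csproj, which stops the build from copying the .NET Standard facade assemblies into your bin folder:

    <PropertyGroup>
      <ImplicitlyExpandNETStandardFacades>false</ImplicitlyExpandNETStandardFacades>
    </PropertyGroup>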

Ah, System.Net.Http. Since it began shipping out-of-band, it’s just the gift that keeps on giving.

I’m not bringing much new information to the table, but I hope this post at least helps someone else find the solution to the same issue without spending hours diagnosing it.

TypeScript Duplicate Identifiers when using npm link

Last night I ran into an issue with TypeScript compilation. I’m working on a frontend web project that uses a library we are developing in another repository. Right now, we’re simultaneously working on both repos, so writing a feature from end to end is a bit of a pain if you first have to commit to the dependency repository and then pull from there to get the latest version into the app itself. Fortunately, there’s a solution for that: npm link / yarn link. This feature allows you to substitute a local version for a dependency, using symbolic links (or NTFS junction points, if you’re Windows-inclined, like myself).

The problem arises when your main app and your linked dependency share one or more dependencies. For example, in our case, the shared dependency was react-intl. If the dependency is not linked, everything works as well as things usually work in js-land (🙄), but once the linking happens, things begin to break.

The core of the problem is that at some point, the TypeScript compiler will encounter two separate, possibly even identical, copies of the same dependency’s type declarations, at which point it will give up and produce a whole mess of duplicate identifier errors.

The fix is simple enough, and pretty well documented already: at the top level, where you run the compilation, you add a path mapping to the compilerOptions section of your tsconfig.json, like so:

    "baseUrl": "./src",
    "paths": {
      "*": [
        "../node_modules/@types/*",
        "*"
      ]
    }

What this means (as I understand it) is that for every non-relative module import the TypeScript compiler attempts to resolve, it will try the paths specified in the array. This means that when encountering the dependency in the dependent module, it will look for it in the parent module’s node_modules/@types directory, and since it finds it there, it will look no further. This coalesces the duplicate dependencies into one instance, and voilà, problem solved. Path mappings can be more specific, too, if you need to target a single problematic package. In this case, I wanted this behavior for all deps, so I went with the easiest route.
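For reference, had react-intl been the only offender, a narrower mapping along these lines would have worked just as well (hypothetical, and assuming the typings come from @types):

    "paths": {
      "react-intl": ["../node_modules/@types/react-intl"]
    }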

Large number of connections when using MongoDB C# driver 2.x

Long time, no posts and whatnot. To break the ice, here’s something I learned today that’s fairly well documented, but not necessarily that obvious.

TL;DR: keep your MongoClients as singletons, like the documentation damn well tells you to do. A more detailed explanation follows.

Yesterday, I deployed an app to production. It was a fairly major deployment in that it involved upgrading the infrastructure: I had updated the app to use the MongoDB C# driver 2.x series, and also upgraded the actual production database from MongoDB 2.x to the latest 3.6. The app was fairly well tested, but of course, the one thing that very rarely gets tested for is production load.

I had performed the driver upgrade as a fairly mechanical search-and-replace type exercise, and while most operations were easy to replace, the one thing that was missing was the ability to disconnect from the server. I hit the documentation, and found out that it says the following:

It is recommended to store a MongoClient instance in a global place, either as a static variable or in an IoC container with a singleton lifetime.

However, multiple MongoClient instances created with the same settings will utilize the same connection pools underneath.

Reading a bit more on the topic confirmed that there was no need to disconnect the client, so all was well in the world. What I didn’t do, however, was register the client as a singleton — because I wanted to keep the changes to a minimum, and the docs stated that it wasn’t required, even if it was recommended.

In the next changeset, I added some telemetry: I wanted to log the duration of MongoDB operations, so I could use our monitoring to see if our Mongo operations got very slow. I found a post titled Monitoring MongoDB with Application Insights and followed its instructions. And here’s where things went wonky.

See, the article’s example uses a lambda function as the cluster configurator, which isn’t bad as such, but. The documentation I quoted above? In full context, it looks like this:

However, multiple MongoClient instances created with the same settings will utilize the same connection pools underneath. Unfortunately, certain types of settings are not able to be compared for equality. For instance, the ClusterConfigurator property is a delegate and only its address is known for comparison. If you wish to construct multiple MongoClients, ensure that your delegates are all using the same address if the intent is to share connection pools.

Combine that with the fact that my MongoClient registration was per-request, and ta-da: I had effectively disabled connection pooling, with no way to dispose of the connections. So perhaps it wasn’t that surprising to see MongoDB log the following:

2018-01-18T08:17:35.340+0200 I NETWORK [listener] connection accepted from 127.0.0.1:52669 #4457 (4455 connections now open)

Yeah.

Fortunately, the fix was rather simple: move the client registration to be a singleton, and that’s it.
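For reference, here’s a minimal sketch of the corrected registration, shown with Microsoft.Extensions.DependencyInjection and a made-up LogDuration helper; any container works, as long as the client (and its configurator delegate) is created exactly once:

    // using MongoDB.Driver; using MongoDB.Driver.Core.Events;
    // Build the settings, including the telemetry hook, exactly once.
    var settings = MongoClientSettings.FromUrl(new MongoUrl(connectionString));
    settings.ClusterConfigurator = cb =>
        cb.Subscribe<CommandSucceededEvent>(e => LogDuration(e.CommandName, e.Duration));

    // Register a single client instance for the entire application.
    services.AddSingleton<IMongoClient>(new MongoClient(settings));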

Mind you, there’s nothing wrong with the instructions in the post I linked to. Had I treated the client instance like the docs suggested, I wouldn’t have had any problems.

Moral of the story? If the documentation recommends something, it’s probably a good idea to do it, I guess.

ASP.NET Core and Assembly Binding Redirects

During the last year, I’ve been a part of launching two production sites that run on ASP.NET Core, and as a company, we’ve had enough dealings with the budding framework that we arranged a full day’s seminar on the topic.

Needless to say, using a framework in anger at this point of its development has led to all kinds of interesting discoveries, the kind that you typically only ever make on the bleeding edge.

Where have my assemblies gone?

One of the major changes in .NET Core compared to the full .NET Framework is that there is no more Global Assembly Cache. All assemblies – including most if not all of the runtime itself – will be shipped as NuGet packages, which means that the assembly loading story is a fairly major departure from the way things used to be. However, .NET Core is not always a viable platform: for instance, currently there is no production-ready server-side image processing capability, since System.Drawing is not cross-platform*. Given that constraint, we’ve ended up deploying our production ASP.NET Core applications on the full .NET Framework, and the full FX still has the GAC.

Currently, ASP.NET Core on the full FX loads assembly dependencies by hooking up AppDomain.AssemblyResolve to work its magic. When your code tries to interact with an assembly that is not yet loaded, the runtime looks for the assembly in your NuGet packages. However, there’s a key phrase in the documentation for the event: “Occurs when the resolution of an assembly fails.” This means that regular assembly binding rules are attempted first.

Typically, this would not be a problem. When you deploy your application, you deploy the NuGet dependencies, and the GAC only contains the framework’s assemblies. However, sometimes you will have a rogue application on your system that installs something to the GAC, and things may go a little pear-shaped.

DocumentDB deserialization woes

Consider this example: our app uses Azure DocumentDB as one of its data stores. The .NET DocumentDB client library uses JSON as its serialization format, and in particular, Newtonsoft.Json as its serialization library. One of the things you can do with that combination is specify that the serialized name of your property is different from the one declared in code, by annotating the property with JsonPropertyAttribute. Now, our app opted to use one of the latest builds of Newtonsoft.Json (7.x), and for the most part, everything worked beautifully. However, my development system had an installed app that, unbeknownst to me, registered an older version of Newtonsoft.Json into the GAC.

Unfortunately, the runtime assembly version of the GAC assembly matched the requirements of the DocumentDB client library exactly, so the runtime loaded that assembly for the DocumentDB client. The practical effect was that when the DocumentDB client (de)serialized objects, it never noticed the JsonPropertyAttribute that we were using. The net result? A single property in that class was never (de)serialized correctly.

It took me a while to figure out what was happening, but the key insight was to look at the loaded modules in the debugger and notice that indeed, we now had two copies of Newtonsoft.Json in memory: the version from the GAC and the version we were specifying as a dependency. Our own code was using the JsonPropertyAttribute from version 7.x whereas the older version of Newtonsoft.Json was looking for the JsonPropertyAttribute specified in that assembly. While the properties were identical in function, they were different in identity, so the attribute we were using was ignored entirely.
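If you suspect the same thing is happening to you, a quick diagnostic along these lines (a sketch, runnable anywhere in the app) lists every loaded copy and where each one came from:

    // using System; using System.Linq;
    var copies = AppDomain.CurrentDomain.GetAssemblies()
        .Where(a => a.GetName().Name == "Newtonsoft.Json")
        .Select(a => $"{a.GetName().Version} loaded from {a.Location}");

    foreach (var copy in copies)
        Console.WriteLine(copy); // a GAC path here is the smoking gun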

Wait, isn’t this a solved problem already?

If you’re a seasoned .NET developer, at this point you are probably thinking “binding redirects”. At least we were – but the question was, where to put them? Another major change in ASP.NET Core is that your application configuration is entirely decoupled from both the configuration of the runtime and the configuration of your web server. Which means that in a fresh out-of-the-box web application, you do have a web.config, but it is only used to configure the interaction between IIS and your application server, Kestrel.

Since Kestrel is running in a process outside IIS, it’s reasonable to expect that Web.config doesn’t affect the behavior of the runtime in that process. And indeed, it doesn’t. But the new configuration system doesn’t have a way to specify the configuration of the .NET runtime either. So where does that leave us?

After a little bit of to-and-fro with the ASP.NET Core team, the answer finally came up: the runtime configuration still exists, but its naming conventions are different from what we are used to. If you create a file called App.config (yes, even when it is a web application) and specify your binding redirects there, they will be picked up, and all is well in the world again.

The configuration file has the same schema as you would expect from a Web.config or a standalone executable’s App.config. The resulting file looks like this:

    <?xml version="1.0" encoding="utf-8"?>
    <configuration>
      <runtime>
        <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
          <dependentAssembly>
            <assemblyIdentity name="Newtonsoft.Json" culture="neutral" publicKeyToken="30ad4fe6b2a6aeed" />
            <bindingRedirect oldVersion="0.0.0.0-4.5.0.0" newVersion="7.0.0.0" />
          </dependentAssembly>
        </assemblyBinding>
      </runtime>
    </configuration>

Hope this helps anyone else encountering the same problem, however uncommon it may be!

(* My colleagues pointed out that I neglected to mention the fact that System.Drawing is not a production-ready server-side image processing solution either, given that it uses GDI+ which uses process-wide locks, and therefore essentially makes the image-processing parts of your app single-threaded.)

Dotless 1.4.4 is out

Idle hands do… open source?

During the last few weeks, I’ve done something I’ve wanted to do for a longer time, and stepped up my involvement in Free Software somewhat. It started out as kind of an accident: I was encountering an annoying bug that I attributed to our usage of dotless in one of our projects, and I went to the issue tracker looking for reports of a similar problem. Having looked at the tracker for a moment, I then checked the corresponding code and noted that, yes, dotless does in fact do the correct thing. Then I proceeded to look at my own code, and it took me all of five seconds to spot the obvious bug.

A bit embarrassing, sure, but not useless. Because while I was looking through the issue tracker, I noted that some of the issues were of a fairly simple nature — maybe even something I could fix? I remembered fixing a bug back in ’10, so I went through the list of closed pull requests and found that I had contributed no less than five PRs.

During that weekend, I came down with the flu and skipped work. However, I used some of that downtime to work on dotless — given that I had no time constraints or expectations of efficiency, I could spend a moment here and another there to fix a bug or two. I went for the low-hanging fruit first, and ended up creating about a dozen pull requests — some with bug fixes, some with actual new features.

After giving things about a week to settle, I asked the current maintainers if they might accept me as a core contributor, since they didn’t seem to have the time to process the pull requests. Not long after that, Daniel granted me contributor access to the project, and off I went, merging the PRs in and cleaning up the issue tracker.

Sweet release

Of course, not everything went perfectly: I intended to release dotless 1.4.3 about a week after having merged the fixes in. And I did — except that I messed up the NuGet packaging so that the standalone dotless compiler was left out of the package. And instead of releasing 1.4.3.1 with the fixed package as I should have, I bumped up the version to 1.4.4. I expect that won’t be much of a problem for anyone, though, so I’m not feeling too bad. After all, I did fix a number of inconsistencies, crashers and things like Bootstrap not compiling when minified. So maybe I can forgive myself a bit of a blunder there. 🙂

What next?

The less.js guys are thinking about building a .NET wrapper around less.js. It’s an interesting idea, to be sure: that way, the .NET implementation would never need to play catch-up with the official version. However, I still believe there’s merit in having a “native” .NET implementation, so I’m going to keep at it for now.

For the next release, I’ve already got @import options, variable interpolation improvements, list arguments and improved mixin guards in. Porting the less.js test cases to get a rough idea of how far behind we are is a logical next step. I’d like to aim for feature parity for 1.5 — on the other hand, maybe more frequent releases with smaller, incremental improvements would serve the project better. At the very least, 1.5 should fully support Bootstrap and KendoUI.

A large slice of my professional history is in line-of-business software with user bases ranging in the dozens or hundreds. It’s exciting and a bit frightening to be taking responsibility for a project that has, over the course of years, been downloaded over 400 000 times from NuGet.org. Time to see if I’m up to the task!

NHibernate TimeoutException with Azure SQL

Recently, I spent nearly three full working days debugging the damnedest thing: a simple NHibernate LINQ query in an Azure test environment was timing out. Together with the SqlAzure client driver, which handles transient faults by retrying queries, this resulted in a situation where a specific page would never load, causing instead a huge spike in database resource usage.

Of course, as it tends to be with these things, the same query against a local SQL Server database worked just fine.

Possibly the strangest part was that after obtaining the query through NHProf, I tried running the same query via SQL Server Management Studio, and the mean execution time of the query was between 100ms and 200ms. Accordingly, I had a hell of a time believing that the issue was an inefficient query as such.

I even tried creating a raw ADO.NET query that had the same command text and parameter specifications… and it executed in under 200ms.

I was about to give up when I had the idea of running both the slow and the fast query against a local database with SQL Profiler enabled: even though there was no discernible difference in execution time there, perhaps I’d be able to make out some difference in the way the queries were executed.

At first, it looked like the queries were identical from the server’s perspective, too.

But then, I noticed the difference.

The slow query declared that the single parameter’s size was 4000 whereas the fast version said it was 1.

Realization began to dawn, and I re-ran my raw ADO.NET query against the Azure database, but with the parameter size set to 4000 — and wouldn’t you know it, the timeout manifested itself immediately.
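In code, the repro boiled down to something like this (a sketch from memory; the query text and parameter name are placeholders):

    // using System.Data; using System.Data.SqlClient;
    using (var cmd = new SqlCommand(queryText, connection))
    {
        // Size = 1 executes in ~200 ms; Size = 4000 times out against Azure SQL.
        var parameter = cmd.Parameters.Add("@p0", SqlDbType.NVarChar, 4000);
        parameter.Value = "A";

        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read()) { /* consume the results */ }
        }
    }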

My current hypothesis is that underneath it all, the database ends up waiting for parameter data that never arrives, which is what causes the timeout. Another strange factor is that the issue doesn’t reproduce with all Azure SQL servers.

All this is triggered by a change to NHibernate Core where the SqlClientDriver skips setting the parameter size to a meaningful value, and instead sets it to a default of 4000.

Fortunately, the workaround is simple: I extended the client driver code with a very specific special-case workaround that sets the parameter size explicitly.
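The gist of it is a custom driver that overrides parameter initialization, along these lines (a sketch: the class name and the exact condition are my own, the real base class in our app is the SqlAzure retrying driver rather than the stock one, and you plug the driver in via the connection.driver_class setting):

    using System.Data;
    using NHibernate.Driver;
    using NHibernate.SqlTypes;

    public class FixedParameterSizeSqlClientDriver : Sql2008ClientDriver
    {
        protected override void InitializeParameter(
            IDbDataParameter dbParam, string name, SqlType sqlType)
        {
            base.InitializeParameter(dbParam, name, sqlType);

            // Replace the driver's 4000-character default with the declared
            // length for single-character string parameters.
            if (sqlType is StringSqlType && sqlType.LengthDefined && sqlType.Length == 1)
                dbParam.Size = sqlType.Length;
        }
    }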

It may eventually turn out that I’ll need to handle more than just the one-character parameter size, but for now, this seems to fix the issue.

Media Services at Global Azure BootCamp Finland

I was at the Global Windows Azure BootCamp in Espoo today, rambling about the coolness of the Microsoft media platform in general and Windows Azure Media Services in particular. The event was hosted at the awesome co-working space of AppCampus by Teemu Tapanila and Karl Ots — hats off for organizing the event, it was great fun!

I’m not sure if anyone actually tried them, but I put up some lab exercises on GitHub for playing around with Azure Media Services. If you’re looking for a starting point for working with Azure Media Services, go ahead and take a look. 🙂

The Anatomy of a Cloud Video Service — My TechDays 2013 talk

So, a few weeks back I was on stage at TechDays 2013 Finland. My topic for the day, titled “The Anatomy of a Cloud Video Service”, was about the Futudent “camera + software + cloud service” solution that I’ve been involved with for quite a while now. I intend to cover the associated technologies in more depth in blog form later, but for now, here’s the video of my presentation.

I spent my hour talking about what the client application does, how we handle video transcoding, what it was like to build the associated video sharing service and all the challenges associated with the entire story.

The talk is in Finnish, so obviously it’s only for a limited audience. Also note that for whatever reason, the video is set to forcibly start at 3:22 and you have to specifically click on “watch the entire video” at the timeline marker in order to get to the first few minutes.

Upcoming breaking changes in Windows Azure Active Directory preview

A moment ago Vittorio Bertocci wrote a post on some upcoming changes to the Developer Preview of WAAD. The changes are of the breaking sort, so if you’re actively using WAAD, this is something you’ll want to react to.

The WAAD MSDN forums have a more detailed announcement about the changes, but at a glance, here are the two key things I picked up on.

The service endpoint names are changing

Your (most likely automatically generated) Web.config settings say something like this now:

    <wsFederation passiveRedirectEnabled="true"
                  issuer="https://accounts.accesscontrol.windows.net/tenant-id/v2/wsfederation"
                  ....
                  requireHttps="false" />

After the change, it will have to read:

    <wsFederation passiveRedirectEnabled="true"
                  issuer="https://login.windows.net/tenant-id/wsfed"
                  ....
                  requireHttps="false" />

The metadata and JWT endpoints are changing too, which may or may not affect you — but if you’re using any of them, you’ll probably know what to do anyway. 🙂 

The User Principal Name claim will no longer be included

A while back, the claim that actually names the user principal was changed from EmailAddress to UPN. Now things are changing again, and in the future, the naming claim type will be … name! Which means your web.config settings need to change from

<nameClaimType value="http://schemas.xmlsoap.org/ws/2005/05/identity/claims/upn" />

to

<nameClaimType value="http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name" />

That’s pretty much it. And of course, if you can’t get the settings right editing them by hand, you can always run the Visual Studio wizard again.

Hope this helps someone. 🙂

Blog move

Just a quick note: I’ve grown tired of Blogger regularly doing a hatchet job on my post markup and of the limited options they give me for customizing the look and feel of the site, so I’ve moved to a new host.

I tried my best not to mess with the existing post URLs, but I’m afraid that at least the RSS feed URLs will have changed.
