Sleeper hit of the year

No, it’s not the new John Grisham book (there’s always a new John Grisham book), nor the dare-I-hope-this-one-doesn’t-suck Star Wars III film. No, this one is the book I just picked up at the local bookstore after watching it get raffled away at VSLive! a few weeks ago: Customizing the Microsoft .NET Framework Common Language Runtime, by Steven Pratschner. Whether you are a .NET developer or Java developer, you should read this.

One of the sleeper features of .NET 2.0 has always been the incredible number of "hook points" the CLR team baked into the 2.0 CLR, some of which were needed by (for example) the Yukon team in order to bake the CLR into the database, much as Oracle and DB2 did with the JVM a few years ago. (In fact, IBM has a version of DB2, code-named "Stinger", that’s going to embed the CLR in much the same way as Yukon and the JVM-embedded databases do, as well.) SQL Server has always kept a tight grip on memory management, for example, and needed to exert tight control over how assemblies get loaded into the database, in order to support traditional database semantics. The baseline feature set of the CLR just wasn’t gonna cut it.

So the CLR team paused, sat down, and wrote a smashing set of COM interfaces (I know, I know, COM is dead, yadda yadda yadda, but the reality is that it will remain the underpinning of the CLR forever, just as the underpinning of the JVM is C++) that allow a CLR host–that is, any unmanaged code that wants to call CorBindToRuntimeEx and obtain an ICLRRuntimeHost*–to effectively take over parts of how the runtime functions, including (but not limited to):

    assembly loading: Don’t like the way the current assembly loading scheme works? Write your own
    failure policy: How the CLR handles failures, such as exceptions
    memory: ‘Nuff said
    threading: Ditto
    thread pool manager: Just about everybody I’ve taught asynchronous delegates to immediately asks, "Is there any way I can control the size of the system thread pool?"
    garbage collection: force GCs, collect statistics about GCs, and so on. No, this is not a replacement for finalizers (but it might help)
    CLR events: "information about various events happening in the CLR, such as the unloading of an application domain"
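
To make the shape of that list concrete, here is a toy model of the negotiation as I understand it: at startup the runtime asks the host, manager by manager, whether it wants to supply its own implementation, and falls back to the built-in one when the host declines. Everything in it (ManagerKind, ToyHost, resolveManager) is a made-up stand-in in portable C++, not the real COM interfaces from the hosting headers, so treat it as a sketch of the idea rather than of the API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for the manager categories in the list above.
enum class ManagerKind {
    AssemblyLoading, FailurePolicy, Memory, Threading,
    ThreadPool, GarbageCollection, Events
};

// A toy "host": it registers only the managers it wants to take over.
class ToyHost {
public:
    void provide(ManagerKind kind, std::string name) {
        assert(!name.empty());  // a manager needs a name in this toy model
        managers_[kind] = std::move(name);
    }

    // Returns true and fills `out` if the host supplies this manager.
    bool getManager(ManagerKind kind, std::string& out) const {
        auto it = managers_.find(kind);
        if (it == managers_.end()) return false;  // host declines
        out = it->second;
        return true;
    }

private:
    std::map<ManagerKind, std::string> managers_;
};

// A toy "runtime": use the host's manager if offered, else the built-in one.
std::string resolveManager(const ToyHost& host, ManagerKind kind) {
    std::string name;
    return host.getManager(kind, name) ? name : "runtime default";
}
```

A host that calls provide(ManagerKind::Memory, "host memory manager") and nothing else would take over memory while leaving threading, GC, and the rest on the runtime defaults, which is the pick-and-choose quality that makes the real thing attractive.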

As a self-proclaimed "plumbing guy", I’m already smacking my lips.

But why do you care?

    You may have a system that turns out to want a wee bit more control over how thread-switches take place. (Games, for example, want some more precise control over quanta, I’m told.) Create a threading manager that uses cooperative fibers instead of raw Windows threads to do that.
    You may want to have something that just "listens" to the application and reports failures and/or other events to a management interface without requiring intrusive coding inside the app. Write a "host" process that kicks off the code in the usual way after establishing an "events host" that listens to the events in question, and potentially discards the old app domain and starts up a new one every 24 hours (if necessary).
    The classic one: you want to replicate the "-ms" option of the JVM (spelled -Xms on current JVMs), which tells the JVM to start with "x" amount of memory, rather than going through the series of "I’m-out-time-to-ask-the-OS-for-memory" cycles that the CLR normally goes through when spinning up apps that consume, say, 512M or 1GB of heap. (I know a financial firm in San Francisco that wanted this two years ago.)

And so on, and so on, and so on.

Why should you care if you’re a Java guy? Because these kinds of hooks are necessary if Java is to keep up, number one, but also because these kinds of hooks would allow for commoditization of the JVM itself, creating a market for graduate students and entrepreneurs to create customized memory management algorithms, for example, that right now require those same entrepreneurs to create an entire JVM. The open-source world would have a field day, creating all sorts of vertical plug-ins at the JVM level that we could pick-and-choose, selecting whichever ones happen to fit our needs best–including a very-real, very-credible memory allocator that just keeps allocating until we run out of room in the heap (for short one-shot gotta-run-as-fast-as-frickin-possible batch jobs, for example).
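
For the curious, the "just keeps allocating until we run out of room" allocator is about as simple as allocators get. Below is a minimal sketch of one in plain C++, assuming a fixed-size arena grabbed up front; BumpArena and its members are names I made up for illustration, and this is nothing like what a production JVM or CLR heap looks like (no object headers, no GC, no freeing at all).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical sketch: a fixed-size arena that only ever moves an offset forward.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

    // Returns storage aligned to `align`, or nullptr once the arena is exhausted.
    // `align` must be a power of two.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const auto base = reinterpret_cast<std::uintptr_t>(buffer_.get());
        const std::uintptr_t current = base + used_;
        // Round the current address up to the next multiple of `align`.
        const std::uintptr_t aligned =
            (current + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t padding = static_cast<std::size_t>(aligned - current);
        const std::size_t remaining = capacity_ - used_;
        if (padding > remaining || size > remaining - padding) return nullptr;
        used_ += padding + size;
        return reinterpret_cast<void*>(aligned);
    }

    std::size_t used() const { return used_; }
    std::size_t capacity() const { return capacity_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;  // released all at once on destruction
    std::size_t capacity_;
    std::size_t used_;
};
```

Allocation is an addition and a bounds check, and "collection" is throwing the whole arena away when the job ends, which is exactly why it suits short one-shot batch work and nothing long-running.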

You want to read this book. And in the meantime, I can’t wait ’til Rotor Whidbey ships, because this is the kind of stuff I want to rip apart. 🙂

BTW: One very reasonable possibility for a custom host is a java-like launcher (clr.exe?) that allows you to pass some customization options on the command-line, just as java.exe does. We would use it solely in cases like -ms/-mx, to enable an otherwise normal .NET app to have some of its environment customized via the command-line rather than having to write our own managed host. Hmm…. On the surface, this wouldn’t represent too much work to toss off in a weekend….
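
The command-line half of such a launcher really is small. Here is a rough sketch of just the argument handling, assuming java.exe-style flags glued to their values (-ms512m, -mx1g) that must appear before the application name. LaunchOptions, parseSize, and parseArgs are hypothetical names of mine, and the interesting part, handing the parsed sizes to a host memory manager, is deliberately left out.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical options for an imagined clr.exe launcher; not a real CLR API.
struct LaunchOptions {
    std::optional<std::uint64_t> initialHeapBytes;  // from -ms<size>
    std::optional<std::uint64_t> maxHeapBytes;      // from -mx<size>
    std::vector<std::string> appArgs;               // app name and its arguments
};

// Parses "512", "512k", "512m" or "1g" into bytes; nullopt on malformed input.
std::optional<std::uint64_t> parseSize(std::string_view text) {
    if (text.empty()) return std::nullopt;
    std::uint64_t multiplier = 1;
    switch (text.back()) {
        case 'k': case 'K': multiplier = 1024ULL; break;
        case 'm': case 'M': multiplier = 1024ULL * 1024; break;
        case 'g': case 'G': multiplier = 1024ULL * 1024 * 1024; break;
        default: break;  // no suffix: plain bytes
    }
    if (multiplier != 1) text.remove_suffix(1);
    if (text.empty()) return std::nullopt;  // a bare suffix such as "m"
    std::uint64_t value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return std::nullopt;
        const std::uint64_t digit = static_cast<std::uint64_t>(c - '0');
        if (value > (UINT64_MAX - digit) / 10) return std::nullopt;  // overflow
        value = value * 10 + digit;
    }
    if (value > UINT64_MAX / multiplier) return std::nullopt;  // overflow
    return value * multiplier;
}

// Splits launcher flags from application arguments. Flags are only recognized
// before the first non-flag argument; returns nullopt if a size is malformed.
std::optional<LaunchOptions> parseArgs(const std::vector<std::string>& args) {
    LaunchOptions options;
    bool seenApp = false;
    for (const std::string& arg : args) {
        const bool isMs = !seenApp && arg.rfind("-ms", 0) == 0;
        const bool isMx = !seenApp && arg.rfind("-mx", 0) == 0;
        if (isMs || isMx) {
            const auto size = parseSize(std::string_view(arg).substr(3));
            if (!size) return std::nullopt;
            (isMs ? options.initialHeapBytes : options.maxHeapBytes) = size;
        } else {
            seenApp = true;
            options.appArgs.push_back(arg);
        }
    }
    return options;
}
```

Stopping flag recognition at the first non-flag argument means anything after the application name passes through untouched, so an app that happens to take its own -ms switch still gets it.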

One noticeable thing missing from the list above, though: JIT compilation, and I think I know why. (Because in the CLR, JIT compilation and bytecode verification are tightly interwoven and would be almost impossible to tease apart into something pluggable. Still, I’d love to see it, guys….)