Loading all of the things

One of the features of vNext that most people don't know about is the ability to add custom project loaders. And it's not just that people don't know you can do custom project loading; a lot of people probably don't even know what project loading in vNext is. So let's start by taking a look at how K works, and what happens under the hood when you run a K application.

Note: this will be highly technical, yet I will not go into every detail of how the K pipeline works. That would take too much time, and the KRuntime would probably have changed by the time I'm done writing this post. Also, since K changes so much and so frequently, there is no guarantee that any of the information written here is still valid by the time you're reading this.

Bootstrapping

Let's start with a really simple application. It consists of one .cs file, and the project.json file in a folder named Test, and that's it. The project.json file looks like this:

{
    "dependencies": {},
    "configurations": {
        "net45": {}
    }
}

And the program.cs file looks like this:

using System;

namespace TestMeta  
{
    public class Program
    {
        public static void Main(string[] args)
        {
            Console.WriteLine("Hello World");
            Console.ReadLine();
        }
    }
}

All in all, it's a dead simple program. All it does is write Hello World, wait for the user to hit ENTER, and then exit. To run this program we navigate with a console to the directory containing the project and run k run. k is a shell script (k.cmd on Windows and k.sh on *nix) that forwards the call, with some added arguments, to klr.exe, which is a native application. klr in turn loads the runtime selected on the machine. In our case, this will be either the Microsoft .NET Framework installed on the machine, or Mono if this is run on a machine other than Windows. The runtime runs some more bootstrapping, and eventually it ends up calling into Microsoft.Framework.Runtime. This is where the real fun starts, and where most of the managed code that deals with running K applications resides.

The Microsoft.Framework.Runtime (hereafter simply the Runtime) creates some services and a service provider, and then creates a few implementations of an interface called IAssemblyLoader. IAssemblyLoader is a simple interface, and it looks like this:

[AssemblyNeutral]
public interface IAssemblyLoader  
{
    Assembly Load(string assemblyName);
}

It then hooks up the global assembly resolve event to make sure that it can properly intercept requests for assemblies. At this point, most of the configuration is completed.
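To make the hook a bit more concrete, here is a minimal sketch of how a list of loaders could be wired into the resolve event. This is not the Runtime's actual code: LoaderChain and DelegateLoader are names made up for illustration, and I'm assuming that a loader returns null when it can't load something and that loaders are asked in order. The only real framework API used is AppDomain.CurrentDomain.AssemblyResolve.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Same shape as the interface shown above, minus the [AssemblyNeutral] attribute.
public interface IAssemblyLoader
{
    Assembly Load(string assemblyName);
}

// Hypothetical adapter so a loader can be written as a lambda.
public sealed class DelegateLoader : IAssemblyLoader
{
    private readonly Func<string, Assembly> _load;

    public DelegateLoader(Func<string, Assembly> load)
    {
        _load = load;
    }

    public Assembly Load(string assemblyName)
    {
        return _load(assemblyName);
    }
}

// Hypothetical helper: asks each loader in turn and returns the first hit.
public sealed class LoaderChain
{
    private readonly IList<IAssemblyLoader> _loaders;

    public LoaderChain(IList<IAssemblyLoader> loaders)
    {
        _loaders = loaders;
    }

    public Assembly Load(string assemblyName)
    {
        foreach (var loader in _loaders)
        {
            var assembly = loader.Load(assemblyName);
            if (assembly != null)
            {
                return assembly;
            }
        }

        // Assumption: null means "none of the loaders could handle this name".
        return null;
    }

    // Hooks the chain into the AppDomain-wide resolve event.
    public void Attach()
    {
        AppDomain.CurrentDomain.AssemblyResolve += OnResolve;
    }

    private Assembly OnResolve(object sender, ResolveEventArgs args)
    {
        // args.Name is a full display name; the loaders here take the simple name.
        return Load(new AssemblyName(args.Name).Name);
    }
}
```

Returning null from the event handler tells the CLR that the assembly could not be resolved by this handler, which is why the chain can simply fall through when every loader gives up.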

After the bootstrapping is completed, the Runtime "loads" the current project (unless otherwise specified, this will be the project in the directory from which you ran k run). This triggers the different loaders to try to load the assembly one after another, until one of them succeeds (or the program crashes). Then a class named Program is searched for, and a Main method is looked up. If the Main method is static, it's invoked as is; otherwise an instance of Program is created using dependency injection, and Main is invoked on that instance. At this point our program is running.

On a side note, one of the things this model enables is baked-in support for async. If I were to change the Main method in my project to return a Task, it would just work.
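The entry-point lookup described above can be sketched with plain reflection. Again, this is my own illustration under a few assumptions, not what the Runtime does line by line: EntryPointSketch and the two sample classes are made up, the createInstance delegate stands in for the DI container, and I'm assuming that a returned Task is simply waited on.

```csharp
using System;
using System.Reflection;
using System.Threading.Tasks;

public static class EntryPointSketch
{
    // createInstance is a stand-in for the DI container (hypothetical).
    public static object InvokeMain(Type programType, string[] args, Func<Type, object> createInstance)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic |
                                   BindingFlags.Static | BindingFlags.Instance;

        MethodInfo main = programType.GetMethod("Main", flags);
        if (main == null)
        {
            throw new InvalidOperationException("No Main method found on " + programType.Name);
        }

        // Static Main: invoke as is. Instance Main: build the instance first.
        object target = main.IsStatic ? null : createInstance(programType);

        // Support both Main() and Main(string[] args).
        object[] parameters = main.GetParameters().Length == 0
            ? new object[0]
            : new object[] { args };

        object result = main.Invoke(target, parameters);

        // If Main returned a Task, block until it has completed.
        var task = result as Task;
        if (task != null)
        {
            task.GetAwaiter().GetResult();
        }

        return result;
    }
}

// Sample with a static Main. It returns a string only so the result is easy to inspect.
public class StaticSample
{
    public static string Main(string[] args)
    {
        return "static:" + args.Length;
    }
}

// Sample with an instance Main that returns a Task.
public class InstanceSample
{
    public Task<string> Main(string[] args)
    {
        return Task.FromResult("instance:" + args.Length);
    }
}
```

Calling EntryPointSketch.InvokeMain(typeof(InstanceSample), args, Activator.CreateInstance) would create the instance through the delegate, whereas for StaticSample the delegate is never touched.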

Project loader

Among the built-in loaders, there's one that is a lot more interesting than the rest, namely the project loader. The project loader is responsible for loading projects (obviously). What's not obvious, though, is that "loading" a project consists of a lot more than one would normally think of when thinking about assembly loading. "Loading" a project in K normally means doing compilation. At runtime. So, how does this all work?

First, the project loader gets a request to load our project Test. It has a list of search paths in which to look for projects. Unless we've specified otherwise, that list will only contain the parent directory of our running project, so if Test resides directly on C:, the project loader will have one search path, which is C:\.

The project loader combines the requested name with each of the directories in the search-path list, in order, and checks whether a project.json resides at that location. In our case, it will construct the path C:\Test\ and look for project.json in that folder. If it finds one, it loads it and parses it as JSON.
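A minimal sketch of that probing step could look like the following. ProjectLocator and FindProjectFile are names I made up; the real project loader does more than this, and returning null for "not found" is my assumption.

```csharp
using System.Collections.Generic;
using System.IO;

public static class ProjectLocator
{
    // Returns the full path of the first <searchPath>/<projectName>/project.json
    // that exists, or null if no search path contains the project.
    public static string FindProjectFile(string projectName, IEnumerable<string> searchPaths)
    {
        foreach (var searchPath in searchPaths)
        {
            var candidate = Path.Combine(searchPath, projectName, "project.json");
            if (File.Exists(candidate))
            {
                return candidate;
            }
        }

        return null;
    }
}
```

With a single search path of C:\ and the name Test, the only candidate checked is C:\Test\project.json, which matches the example above.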

The project.json can contain a few different things, and I'm not going into details on what it can and can't contain here (in practice it can contain anything that's valid JSON, it'll just ignore most of it), but we'll look into some of the key values later.

After parsing the project.json file, the project loader delegates the actual loading of the assembly to the RoslynAssemblyLoader (henceforth known as the Roslyn loader).

Lastly, the Roslyn loader constructs a compilation using Roslyn (Microsoft.CodeAnalysis). The compilation is handed the sources and references and is then run, producing the final assembly, which is returned to whatever requested the "load". This concludes the work of the project loader.

Only one language

One of the "problems" with how the project loader worked a few versions back (which is basically what I described in the previous section) is that it only supported one language, namely C#. This is because the project loader, which reads the project.json file, always delegated the loading to the Roslyn loader, which generates a CSharpCompilation and compiles the project as if it were C#, no matter what. At least that's how it used to be, until @davidfowl added support for custom loaders.

Now, custom loaders do not mean that you inject your own loader into the Runtime, where it sits side by side with the project loader, the NuGet loader, and the other loaders. Instead, it means that the project loader can delegate the compilation work to a user-specified loader, discovered at runtime, instead of the Roslyn loader. The way this works is by adding a loader key to the project.json that points at an "assembly" and a type in said assembly, which does the loading instead of the Roslyn loader. I wrote assembly in quotes because, just like your project's "assembly" is really just sources that get "loaded" (compiled in this case) at runtime, so too can the custom loader itself be. Or it can be a NuGet package, or a GAC assembly (heaven forbid), or anything else that's loadable! An example of just how crazy this can get can be found later in this post.

CSC Loader

As I've already said a few times, the Roslyn loader uses the magic of Roslyn to enable efficient in-memory compilation of C# projects. However, if we wanted to be really backwards, and use an old version of the C# compiler, namely CSC, we totally could. In fact, @davidfowl has already done so. It can be found at CustomLoader over at GitHub. The CSC loader works a lot like the Roslyn loader, in that it's handed a list of .cs files, and references. However, instead of using Roslyn to compile the C# files in-memory, it uses Process.Start to invoke CSC. Now, the question is obviously "why would you ever want to use something like this?". And the answer is probably "you wouldn't", but it's still a good reference if one wants to create new custom loaders.
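To give an idea of what "uses Process.Start to invoke CSC" boils down to, here is a rough sketch. It is not the code from @davidfowl's CustomLoader repository: CscSketch is a made-up name, the path to csc.exe is something the caller has to supply, and the only compiler switches I rely on are the standard /nologo, /target:library, /out: and /reference: options.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class CscSketch
{
    private static string Quote(string path)
    {
        return "\"" + path + "\"";
    }

    // Builds the csc command line: switches first, then references, then sources.
    public static string BuildArguments(string outputPath, IEnumerable<string> sources, IEnumerable<string> references)
    {
        var parts = new List<string> { "/nologo", "/target:library", "/out:" + Quote(outputPath) };
        parts.AddRange(references.Select(r => "/reference:" + Quote(r)));
        parts.AddRange(sources.Select(s => Quote(s)));
        return string.Join(" ", parts);
    }

    // cscPath is an assumption: wherever csc.exe happens to live on the machine.
    // Returns the compiler's exit code (0 means the build succeeded).
    public static int Compile(string cscPath, string outputPath, IEnumerable<string> sources, IEnumerable<string> references)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = cscPath,
            Arguments = BuildArguments(outputPath, sources, references),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A loader built on top of this would still have to load the resulting file from disk (for instance with Assembly.LoadFile) and return it, which is exactly the part that Roslyn manages to do in memory.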

Usage

Using a custom loader is simple. Go into your project.json file and add a loader key to the root object. The value of loader should be an object consisting of name, which is the name of the assembly containing the loader, and type, which is the type name of the loader. It's also important to add the loader to the dependencies of your project. This both enables fetching of the loader and its dependencies (if it's a NuGet package) during kpm restore, and makes the dependencies available to the runtime. Otherwise the loader will probably crash, giving you exceptions of the 'namespace "SomeNameSpace" not found' sort.

An example project.json (stolen from @davidfowl) looks like this:

{
    "dependencies": {
        "Loader.FSharp": ""
    },
    "code": "**/*.fs",
    "loader": {
        "name": "Loader.FSharp",
        "type": "Loader.FSharp.FSharpAssemblyLoader"
    },
    "configurations" : {
        "net45" : { }
    }
}
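Here is a hedged sketch of how the loader key from a project.json like the one above could be read and turned into a loader instance. Everything named here is hypothetical (LoaderInfo, LoaderKeySketch), I use System.Text.Json purely for convenience rather than whatever JSON parser the Runtime uses, and I assume the loader type has a parameterless constructor, which a real loader may well not have.

```csharp
using System;
using System.Reflection;
using System.Text.Json;

// Hypothetical holder for the two values of the "loader" key.
public sealed class LoaderInfo
{
    public string Name;
    public string Type;
}

public static class LoaderKeySketch
{
    // Returns null when the project.json has no "loader" key,
    // meaning the default (Roslyn) loader should be used.
    public static LoaderInfo ReadLoaderKey(string projectJson)
    {
        using (JsonDocument document = JsonDocument.Parse(projectJson))
        {
            JsonElement loader;
            if (!document.RootElement.TryGetProperty("loader", out loader))
            {
                return null;
            }

            return new LoaderInfo
            {
                Name = loader.GetProperty("name").GetString(),
                Type = loader.GetProperty("type").GetString()
            };
        }
    }

    // loadAssembly stands in for "ask the Runtime to load this name",
    // which may itself mean compiling another project.
    public static object CreateLoader(LoaderInfo info, Func<string, Assembly> loadAssembly)
    {
        Assembly assembly = loadAssembly(info.Name);
        Type loaderType = assembly.GetType(info.Type, true); // throws if the type is missing
        return Activator.CreateInstance(loaderType);
    }
}
```

The interesting design point is that CreateLoader doesn't care where the assembly comes from. The name is just handed back to the loading machinery, which is what makes it possible for a loader to be a project itself.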

Custom loader limitations

There are, however, some limitations to using custom loaders as of today, one of which is that you cannot build your applications to assemblies on disk using kpm build. So there is currently no way to automatically get NuGet packages out of projects that require custom loading. This will, however, be addressed in the future.

Loading my custom loader, using a custom loader, using a custom loader...

Here comes the fun part. The fact that these loaders are just projects themselves, like all the other projects, means that a loader can itself be loaded using a custom loader (though obviously not by itself). So, for instance, if you wanted to create a custom loader that loads F# files and is written in F#, what you could do (and I have no idea whether or not this would be a good idea) is to create a simple FSharpBootstrapLoader in C# that simply calls into FSC and returns the resulting assembly. Then you could create an FSharpLoader in F# that is loaded using the FSharpBootstrapLoader, and that has support for in-memory compilation and all the fancy stuff you'd want in the loader that people are actually going to use. If I then added a project in F# (named TestFS) which used the FSharpLoader, the following would happen:

  1. Application does Load("TestFS").
  2. The system tries to load TestFS using the project loader.
  3. TestFS/project.json is found, and parsed. The loader to use is found to be FSharpLoader.
  4. The project loader does Load("FSharpLoader").
  5. The system tries to load FSharpLoader using the project loader.
  6. FSharpLoader/project.json is found and parsed. The loader to use is found to be FSharpBootstrapLoader.
  7. The project loader does Load("FSharpBootstrapLoader").
  8. The system tries to load FSharpBootstrapLoader using the project loader.
  9. FSharpBootstrapLoader/project.json is found and parsed. It does not specify a custom loader, thus the default one will be used.
  10. Roslyn compiles FSharpBootstrapLoader.
  11. The compiled version of FSharpBootstrapLoader is returned to where it was requested.
  12. FSharpBootstrapLoader is used to compile FSharpLoader.
  13. The compiled version of FSharpLoader is returned to where it was requested.
  14. FSharpLoader is used to compile TestFS.
  15. The compiled version of TestFS is returned to where it was requested.
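The steps above are really just a recursion, which can be sketched in a few lines. This is a toy model and not Runtime code: LoaderChainTrace is a made-up name, the dictionary stands in for the project.json files on disk (project name mapped to its loader, or null for the default), and the "compiling" is only recorded as text. The cycle check is my own addition, covering the "though obviously not itself" case.

```csharp
using System;
using System.Collections.Generic;

public static class LoaderChainTrace
{
    // projects maps a project name to the name of its custom loader,
    // or to null when the project uses the default (Roslyn) loader.
    // Returns the compile steps in the order they would happen.
    public static List<string> Load(string project, IDictionary<string, string> projects)
    {
        var log = new List<string>();
        Load(project, projects, new HashSet<string>(), log);
        return log;
    }

    private static void Load(string project, IDictionary<string, string> projects,
                             HashSet<string> inProgress, List<string> log)
    {
        // A loader that (directly or indirectly) needs itself can never be loaded.
        if (!inProgress.Add(project))
        {
            throw new InvalidOperationException("Loader cycle detected at " + project);
        }

        string loader = projects[project];
        if (loader == null)
        {
            log.Add("Roslyn compiles " + project);
        }
        else
        {
            // The loader has to be loaded (and possibly compiled) first.
            Load(loader, projects, inProgress, log);
            log.Add(loader + " compiles " + project);
        }

        inProgress.Remove(project);
    }
}
```

Feeding it TestFS, FSharpLoader and FSharpBootstrapLoader wired up as described gives three compile steps, innermost first: Roslyn compiles FSharpBootstrapLoader, which compiles FSharpLoader, which compiles TestFS.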

Loaderception! (Sorry, couldn't resist).

An interesting thing to note here is that none of the loaders are loaded until they are actually needed. The same goes for Roslyn. So if you create a project where everything is precompiled, Roslyn (or any custom loader, for that matter) will never get loaded, which reduces memory and CPU usage.
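The "only when needed" behaviour is easy to imitate with Lazy<T>. This is just an illustration of the idea, and I'm not claiming this is how the Runtime implements it; LazyLoaderRegistry is a made-up name.

```csharp
using System;
using System.Collections.Generic;

public sealed class LazyLoaderRegistry
{
    private readonly Dictionary<string, Lazy<object>> _loaders =
        new Dictionary<string, Lazy<object>>();

    // Registering only stores the factory; nothing is created yet.
    public void Register(string name, Func<object> factory)
    {
        _loaders[name] = new Lazy<object>(factory);
    }

    // True once the loader has been created by a call to Get.
    public bool IsCreated(string name)
    {
        return _loaders[name].IsValueCreated;
    }

    // The first call runs the factory; later calls reuse the same instance.
    public object Get(string name)
    {
        return _loaders[name].Value;
    }
}
```

If nothing ever calls Get("Roslyn"), the factory never runs, which mirrors the case where every project is precompiled.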

Now, unfortunately I still don't have a comment system on this blog, because I need to rewrite the theme I use first, so if there are any questions please send me an email at alxandr <at> alxandr <dot> me.

Next up I plan to write a bit about implementing custom loaders, unless I find that it ends up being just a bunch of code and no actual blogging, in which case @davidfowl's samples should suffice. If there's anything else you'd like me to write about, be it vNext, JavaScript, or anything else that's somewhat related to programming, send me an email about that too.